PREFACE


On November 30, 1988, about 50 scientists and engineers from various NASA centers convened at 
Ames Research Center to participate in a workshop on Vision Science and Technology (VST). The goals 
of the workshop were to 

1. Refine the definition of Vision Science and Technology 

2. Identify NASA needs in VST 

3. Inventory NASA's expertise in relevant disciplines 

4. Foster communication among NASA groups 

5. Develop a strategy for NASA VST 

During the next 3 days, workshop participants delivered presentations on a wide range of research 
projects dealing with basic vision science or application of this science to NASA missions. Participants 
also took part in discussions on how NASA could address future needs in this area. This document is a 
report on that workshop, and a white paper on the future of Vision Science and Technology in NASA. 
We wish to thank all those who contributed to the success of the meeting and to the completion of this
report.


Andrew B. Watson 
Jeffrey Mulligan 
SUMMARY


This document attempts to provide a broad review of Vision Science and Technology within NASA. 
We have defined the subject, noted its applications in both NASA and the nation at large, and surveyed 
current NASA efforts in this area. We have noted the strengths and weaknesses of the NASA program, 
and have identified actions that might be taken to improve the quality and impact of the program. 

This area has enormous potential. We are entering the visual age, in which visual communication is 
the global lingua franca, and robotic vision is the front end to an ever increasingly automated tech- 
nology. At the intersection of computers, video, robotics, and imaging, a new science and technology is 
being born, and it will have great consequences for NASA. To fully exploit this technological revolution,
the agency should be prepared to participate actively and to assume a leadership role. 

NASA TM-102214 (Revision 1) replaces an earlier document that was inadvertently printed without
three abstracts. 
EXECUTIVE SUMMARY


Through Vision Science and Technology (VST), researchers seek to understand the process of vision
at the biological, physical, and mathematical levels, and to translate that understanding into practical 
advances in human factors, visual displays, image processing, and autonomous vision. 

VST is an important element of many national initiatives in science and engineering, such as High- 
Definition Television, Human Genome Project, Superconducting Super Collider, and Strategic Defense 
Initiative, as well as in the efforts to revitalize American industry through increased automation. 

VST is also an important element of many NASA programs and missions, such as Pathfinder, Space 
Station Freedom, National AeroSpace Plane, the Hypersonic Civilian Transport, Global Change Tech- 
nology Initiative, and Aviation Safety. In these programs VST serves in the acquisition and analysis of 
scientific data, in the prediction of human performance, and in the provision of autonomous vision 
capabilities. 

NASA currently supports a wide range of VST activities at a number of centers including Ames 
Research Center (ARC), Goddard Space Flight Center (GSFC), Lyndon B. Johnson Space Center (JSC), 
John C. Stennis Space Center (JSSC), Jet Propulsion Laboratory (JPL), Langley Research Center
(LaRC), and Lewis Research Center (LeRC). Much of this work has an excellent international 
reputation. 

The NASA effort in VST is of high quality, but the level of effort is insufficient to meet the require- 
ments of future NASA missions. The NASA program in VST could be strengthened sufficiently to meet 
these future challenges. Steps in this direction should include: explicitly acknowledging VST in plan- 
ning and funding; enhancing the complement of in-house researchers; encouraging selective excellence 
in a small number of VST areas; establishing an in-house center of excellence in VST; encouraging 
collaboration with universities and among centers; and adopting a long-term emphasis on fundamental 
work in VST to support future applications. 
VISION SCIENCE AND TECHNOLOGY


Vision Science and Technology describes a range of scientific and engineering areas that share a 
common goal of understanding visual processes in both human and machine. Visual processes extract 
information from imagery, both natural and synthetic. Visual processes can be examined and understood 
on many different levels, including biological, physical, and mathematical. Technical areas include fun- 
damental research on human, biological, and artificial vision, engineering of artificial vision systems, 
image processing, visualization technology, and visual human factors. Vision Science draws on a 
number of disciplines, including

1. Psychology 

2. Human factors 

3. Neuroscience 

4. Neural networks 

5. Image processing 

6. Computer science 

7. Artificial intelligence 

8. Communications engineering 

9. Controls and guidance 

Vision Technology comprises those practical devices, systems, and techniques derived from Vision 
Science. These technologies can be grouped in the following way: 

1. Predictors of visual performance 

2. Visual displays and interfaces 

3. Image synthesis 

4. Image management 

5. Image analysis 

6. Artificial vision 

Specific examples of VST work under way within NASA are 

1. Computational models of biological vision

2. Computer vision algorithms and applications 

3. Image-processing algorithms and applications 

4. Telescience 

5. Remote sensing and analysis 

6. Visualization and human interfaces 

7. Neural networks for vision processing 

8. Robust color analysis algorithms 

9. Vision/sparse distributed memory for pattern recognition 

10. Optimal computational architectures for vision 

11. Assessment of human visual performance 

12. Human-matched image coding 

13. Helmet-mounted displays 
14. Stereo displays

15. Fundamentals of Vision Science 

Vision Science and Technology has experienced explosive growth in recent years. Computer vision 
is a burgeoning field, both because of its scientific challenges and because of its recognized importance 
in robotics and automation. There is also a torrent of information on the detailed neuroanatomy, neuro- 
physiology, and psychophysics of biological vision, and these data are being exploited in computational 
models of biological vision as well as in artificial vision systems. Vision also figures prominently in 
neural network research, and in research on massively parallel computer architectures. Computer inter- 
faces have become more graphic and “visual,” and there is a heightened interest in visualization of 
scientific information. In all of these areas, Vision Science and Technology is coming to the fore as a 
coherent and critical discipline. 

VST is, in the abstract, a powerful means of collecting and analyzing information about our physical 
surroundings. As such, it is natural that it will play an important role in an agency dedicated to research 
and discovery, on Earth and in space, by human and automated means. Vision Science and Technology 
is a vital element in NASA’s overall research program, as documented below. 


VST AND NATIONAL NEEDS 


Vision Science and Technology plays a prominent role, not only in NASA, but also in the national 
scientific and engineering enterprise. The United States is contemplating or is currently engaged in a 
number of large scientific, engineering, and industrial projects that depend upon VST. The following 
sections provide some examples. Dollars indicated are budget requests or allocations, and are provided 
only to suggest the relative size of various programs. 

High-Definition Television 


There is now a strong sentiment in Congress and in industry that the United States should undertake 
a vigorous effort to develop the next-generation high-definition television system. Envisioned as a sys- 
tem using extensive digital processing to achieve at least two times the current television resolution, it 
will involve extensive research in VST areas such as visual human factors, image coding and 
compression, and image processing. 


Earth Resources 


National and international concern for the environment and for monitoring natural resources is 
creating an expanded need for Earth resource monitoring systems. Planning is under way within NASA 
for a Global Change Technology Initiative (see below), but this concern also extends to other agencies 
such as the National Oceanic and Atmospheric Administration (NOAA), U.S. Geological Survey (USGS),
Department of Transportation (DOT), Department of Defense (DOD), Environmental Protection Agency 
(EPA), Department of Agriculture, and private industry. Processing and visualization of Earth resource 
data is vital if the information is to be useful, and this involves extensive use of VST. 
The Human Genome Project (M$100 FY90)


With the goal of mapping the entire human genetic code, this project is jointly managed by the 
National Institutes of Health (NIH) and the Department of Energy (DOE) and is anticipated to be one of
the largest single scientific endeavors in the coming decades. The project can be accomplished only 
through extensive automation of tasks that are currently accomplished by human perception. It will 
therefore demand extensive use of automated visual pattern recognition and will require extensive 
advances in this area. 


Superconducting Super Collider (M$160 FY90) 

This high-energy physics project will also depend on automated pattern recognition systems for 
process monitoring and event detection. In addition, high-energy physics in general has come to depend 
upon advanced visualization techniques to render accessible massive data sets. 

Industrial Revitalization 

As we approach the 21st century, there is a need for revitalizing American industry. This will require 
greatly expanded automation of the manufacturing process, and this automation will involve extensive 
use of robotic vision. 


Strategic Defense Initiative (M$5,591 FY90) 

Initially envisioned as a space-based defensive shield against nuclear attack, this plan was dependent 
upon automated visual monitoring and identification of potential threats. The visual technologies called 
for were in fact well beyond what is currently possible. While the direction of this program is in some 
doubt, it seems certain that the national defense will move toward automated monitoring systems that 
depend on vision technology. 


National Defense (M$315,000 FY90) 

The military recognized early the potential of autonomous vision in enhancing the safety and
effectiveness of military operations. The Defense Advanced Research Projects Agency (DARPA) has 
been the single largest source of funds for computer vision research in the last several decades, and 
continues to fund large applications of VST such as the Autonomous Land Vehicle (ALV). A recent
“DOD Critical Technologies Plan,” prepared for the Congressional Armed Services Committees,
identified 22 specific technologies with extraordinary urgency or promise. Of these, five involve VST: 
machine intelligence/robotics, integrated optics, passive sensors, automatic target recognition, and data 
fusion. 


Nuclear Power 

Both generation of nuclear power and research on nuclear fission and fusion (M$349 FY90) employ 
remote and robotic vision for the monitoring of hazardous processes. 
These examples illustrate the degree to which VST is a critical and growing component of our
nation’s scientific, economic, and military enterprise. Vision is an extraordinary system whereby 
humans absorb information about their world, in order to predict and control complex dynamic pro- 
cesses. As the technology matures, we are beginning to equip our machines, our organizations, and our 
nation with this powerful capability. 


VST AND NASA NEEDS 


Vision Science and Technology is a vital element in NASA’s overall research program. It is a key 
element of virtually all future autonomous exploration activities, such as autonomous rovers, sample 
acquisition, and autonomous landing and docking. It has been and continues to be among the most 
critical elements of Space and Aviation Human Factors. And it is the leading edge in efforts to develop 
next-generation computational human factors for aerospace systems. Here we briefly document the 
critical role that is now or soon will be played by VST in a number of specific NASA programs. 

Pathfinder 

The Pathfinder program was designed to accelerate the development of critical technologies for 
advanced space missions. Of the 16 elements, 9 are paced by advances in VST. We examine the Path- 
finder program in some detail, because it indicates the advanced technologies that NASA views as 
essential, and it illustrates how deeply VST is embedded in these technologies. 

Planetary Rover 

Autonomous exploration of planetary surfaces will require extensive advances in visual guidance 
and obstacle detection. Current accomplishments in autonomous land vehicles, such as the DARPA 
ALV project and the JPL rover work, fall far short of needs. 

Sample Acquisition 

Autonomous identification of sites of geological interest, and analysis of collected samples, are
essentially problems in visual pattern recognition. Both will use visible wavelength image data, as well 
as other methods. 


Autonomous Rendezvous and Docking 

Although these tasks may be achieved primarily through preplanned maneuvers and guidance 
telemetry, it seems likely that additional visual monitoring systems will be provided, to detect hazards 
and other unusual situations. The advantage of vision is that it can monitor the entire situation, not just 
those few signals deliberately provided. 
In-Space Construction


Construction of large complex structures will require robots with sensing capability. The order of 
vision capability envisioned for these robots does not yet exist and will require extensive research. 

Extravehicular Activity 

Current activity focuses on high-pressure suits, but future work will necessarily turn to EVA infor- 
mation systems, displays, and controls. For example, design of helmet-mounted displays is highly 
dependent on principles derived from VST. 

Human Performance 

Design of the Human-Machine Interface has traditionally relied upon information about visual 
capacities of the user. This is becoming even more so with the advent of high-resolution, highly pro- 
grammable, color displays, and with the approach of practical virtual environment displays. There is a 
general trend in display design toward a more graphic approach, and, consequently, a greater need for 
understanding human visual information processing. 

Closed Environment Life Support 

Closed life support systems will require augmented health monitoring to guard against toxic effects. 
Visual function testing is emerging as a promising technology for early detection of neural toxicology. 

Autonomous Lander 

Current technology is not adequate for fully autonomous landings. Vision technology is the leading 
candidate for guidance and hazard detection. Work on this application is currently under way within 
NASA. 


Fault-Tolerant Systems 

This element is concerned with photonic processing in terrain-analysis problems. Elements of VST 
such as image processing, pattern recognition, and parallel computation are fundamental. 

Global Change 

The Ride Report to the NASA Administrator on the future of the space program proposed a 
“Mission to Planet Earth,” a concerted effort to study global change. In addition, national and interna- 
tional concern for the environment and for the monitoring of natural resources is creating an expanded 
need for Earth resource monitoring systems. Planning is under way within NASA for a Global Change 
Technology Initiative to augment the technologies that are critical to Earth observation. These technolo- 
gies have been categorized as observation, information, and operation. The first two involve sensors and 
the processing, integration, and visualization of sensor data, and in all of these activities, VST plays an 
important role. 
Space Station Freedom


Space Station Freedom (M$2,050 FY90) will require many of the technologies identified under 
Pathfinder, but it also has its own special challenges. Viewed in large measure as a site for extrater- 
restrial research, it will be the first large-scale effort toward telescience. The capability to conduct 
science at a distance is absolutely dependent on the ability to monitor events and either convey infor- 
mation visually to remote observers or automatically recognize critical events. As such, it will require 
extensive means for image capture, processing, storage, transmission, and display, and possibly for some 
autonomous vision. 

The image demands of telescience, coupled with operational and informational needs of station 
occupants and those on the ground, create the need for general image management capabilities. The 
technology needs here are in areas such as image coding and compression, and image display. 

Finally, the human crew must be augmented by automatic visual monitoring of station and 
environment. This is an instance of robotic vision, which may be more extensively used in later 
evolution of the station. 


National AeroSpace Plane 

National AeroSpace Plane (M$556 FY90), a joint effort of NASA and the U.S. Air Force to develop
a hypersonic spacecraft that could take off and land like a conventional aircraft, will require remote 
viewing systems because of cockpit location and surface angles. It may also require autonomous vision 
systems to assist in guidance, deployment, and docking. 

Aviation Safety 

Aviation safety is an important element in NASA's goals and objectives. VST contributes to this 
element in two important ways. The first is in predicting human visual performance in guiding the craft, 
monitoring the outside world, and extracting information from displays. The second contribution is 
toward the design of cockpit displays, and more generally, toward crewstation design. This latter contri- 
bution is of ever greater importance as a revolution takes place in cockpit displays, in the form of color, 
CRT displays, multifunction displays, and advanced visualization techniques such as graphic flight 
directors and displays of air traffic. 


Base Research and Technology 

The preceding paragraphs describe large programs directed at specific technical or mission objec- 
tives. Beyond these, the base NASA research and technology program contains major components that 
involve VST. Examples in the Office of Aeronautics and Space Technology (OAST) are research in 
computer science and artificial intelligence, especially with respect to image processing, image analysis, 
and computer vision, and human factors, especially with respect to visual sensitivity, visual information 
processing, and visual displays. In the Office of Space Science and Applications (OSSA), examples are 
research on visual-vestibular interactions, and effects on vision of long-duration spaceflight. 
The key technical areas are

1. Information display (display design, information formatting) 

2. Image coding (image compression, storage, transmission) 

3. Image analysis (scientific analysis, computer vision) 

4. Human vision (data, models, guidelines) 

5. Image synthesis (visualization, graphics, simulation) 


NASA RESEARCH CENTERS 


The material presented at the workshop, and other sources, demonstrate considerable expertise in 
VST at a number of NASA research centers. In this section we provide a brief survey of these centers. 
This survey is not exhaustive, and we apologize for any omissions.

Ames Research Center 


Ames Research Center has VST activity under way in several areas. Codes FS and FL are collabo- 
rating on computer vision techniques for autonomous guidance of both the rotorcraft and the Mars 
lander. Code FL has a substantial group working on human and biological vision and its relation to 
image processing, display design, and computer vision. This group has extensive collaborations with 
vision scientists at Stanford, Berkeley, University of California, Irvine, University of California,
Santa Cruz, and SRI (Menlo Park and Sarnoff). The Research Institute for Advanced Computer Science
(RIACS), situated at Ames, is engaged in collaborative work with Code FL on visual recognition and
neural networks. Code RI is initiating a program in photonics research, including optical processors for 
visual recognition. Code RI also hopes to use vision technology for remote sample acquisition and 
analysis. Ames (Code FL) has also played a leading role in technology utilization projects to use image
processing in visual prosthetics. 

Principal researchers include 


Albert Ahumada, ARC-FL 
Victor Cheng, ARC-FS 
Mal Cohen, ARC-SL
Dallas Denery, ARC-FS 
Steve Ellis, ARC-FL 
Scott Fisher, ARC-FL 
Mary Kaiser, ARC-FL 
James Larimer, ARC-FL 
Mike McGreevy, ARC-FL 
Jeffrey Mulligan, ARC-FL 
Ellen Ochoa, ARC-RI 
Don Rosenthal, ARC-RI 
Banavar Sridhar, ARC-FS
Andrew Watson, ARC-FL 


John Perrone, Stanford 
Mike Raugh, RIACS 
Lee Stone, FL-NRC 
Matt Valeton, FL-NRC 
Brian Wandell, Stanford 
Goddard Space Flight Center


Goddard has recently established a Center of Excellence in Space Data and Information Sciences
that may be involved in image analysis research. 

Principal researchers include 

J. Gualtieri, GSFC-RRA 
M. Manohar, GSFC 

Milt Halem, GSFC-Space Data and Computing 

Jet Propulsion Laboratory 


Jet Propulsion Laboratory has a substantial program of research in vision for robotics, which 
emphasizes technology for autonomous space operations. An example is work on stereo vision for 
autonomous rovers. JPL also has several experts on human and biological vision. This work has 
contributed to both robotics and human interface design, as well as prostheses for low vision. 

JPL also has a section devoted to Image Processing Applications and Development, which has 
produced the SPAM package. 

Other imaging science and technology is conducted in Section 34, which deals with the HIRIS 
infrared imaging system among other things. 

Principal researchers include 

Brian Wilcox, JPL 

Donald Gennery, JPL

Richard Anderson, JPL 

Teri Lawton, JPL 

Dan Diner, JPL 

Ray J. Wall, JPL, Image Processing Applications and Development Section 

Lyndon B. Johnson Space Center 

Lyndon B. Johnson Space Center (JSC) has a number of research efforts concerned with vision. 
There is work in the areas of photonics, image processing, and human factors. The photonics work is in 
early stages, and will depend somewhat on NASA planning efforts now under way. The image 
processing work has led to the development of a general spatial remapper that has applications in 
guidance, machine vision, and prosthetics for low vision. 

Principal researchers include 

Richard D. Juday, JSC 

Dave Loshin, Univ. of Houston, School of Optometry 

Marianne Rudisill, JSC 
John Stennis Space Center


John Stennis Space Center has expertise in image processing, traditionally applied to Earth resource 
issues, which they hope to apply as well to prostheses for low vision. They are currently collaborating 
with Johns Hopkins University on an image processing project. 

The principal researcher is Doug Rickman, JSSC, Earth Resources Lab. 

Langley Research Center 

Langley Research Center has an active and widely recognized program that concentrates on issues of 
image acquisition, processing, and display. The emphasis of this program is on the development of 
focal-plane processing techniques and technologies to effectively combine image gathering with coding. 
They have extensive collaborations with Odetics, Inc., Microtronics, Inc., and University of California, 
Irvine. A notable product of this collaboration is the development of the Intensity Dependent Spread 
(IDS) model of retinal processing in a commercially available hardware implementation for image- 
processing systems, and in a neural network implementation that combines photon detection with 
asynchronous parallel processing. 

Principal researchers include 

Friedrich O. Huck, LaRC, Information Systems Division 

Rachel Alter-Gartenberg, Old Dominion University 

Tom Cornsweet, University of California, Irvine

Darryl D. Coon, Microtronics Associates, Inc. 

Carl L. Fales, LaRC, Information Systems Division 

Daniel J. Jobson, LaRC, Information Systems Division 

Sarah John, Science and Technology Corp. 

Eleanor Kurrasch, Odetics Inc. 

Ramkumar Narayanswamy, Science and Technology Corp.

George B. Westrom, Odetics Inc. 


Lewis Research Center 

A major focus of Lewis Research Center in vision and imaging is HHVT (High-Resolution High- 
Frame Rate Video Technology). The goal of this program is to assess needs and develop technology for 
video monitoring of microgravity experiments. It is expected that results will transfer to more general 
telescience applications. Major activity to date has concentrated on designing the imaging system, 
evaluating user requirements, and developing data compression techniques for onboard archiving and 
data transmission. 

Image data compression techniques are also being developed to support lunar and planetary 
exploration missions to reduce communications channel bandwidth requirements for transmission of 
high-resolution images. 
Principal researchers include

Robert Butcher, LeRC 

Mike Lewis, LeRC 

Marlene Metzinger, LeRC 

Mary Jo Shalkhauser, LeRC

William Thompson, LeRC 

Wayne White, LeRC 

John Zuzek, LeRC 

William Hartz, Analex 

Khalid Sayood, U. Nebraska-Lincoln


STATUS OF NASA VST 


To determine a future direction for NASA VST it is important to understand where we are now. In 
the previous section we described specific research projects now under way. Here we examine in a more 
general way the strengths and weaknesses of NASA's current VST program. 

Strengths 

Individuals and research- The most prominent strength of the existing NASA program is the quality 
of the individuals and their research. This is a strong base on which to build. A more detailed survey of 
NASA activities is given elsewhere in this report, but here we note that in a number of key areas, such as 
image processing, computational models of human and biological vision, motion processing, stereo, 
autonomous guidance, image coding and compression, and advanced visual displays, there exists a core 
of knowledge, achievement, and excellence. The quality of these efforts is objectively verified by 
numbers of journal publications and invited conference presentations. 

Program attractiveness- Another strength that NASA brings to this challenge is its attractiveness, 
relative to industry and military programs, to talented researchers. While salaries and working conditions 
make it difficult to attract and retain the best people, particularly at the senior levels, many individuals 
are willing to sacrifice in order to work on “the final frontier.” NASA has always prided itself on explor- 
ing technologies that are at the forefront of what is possible. Elsewhere, we have shown that VST per- 
meates much advanced research. VST is, in addition, one of the more high-profile, glamorous facets of 
industrial and academic research and development. These facts recommend it as a suitable investment of 
NASA resources. 

The center laboratories offer researchers an environment well-suited to support leading-edge 
research in many fields. The proximity of scientists and engineers working in related fields such as 
biology, computer science, and numerical analysis fosters an atmosphere similar to that of leading
research universities. The application-specific goals, a number of which have been enumerated above, 
bring excitement and energy to the endeavor. NASA scientists and engineers are in a unique position to 
bridge efforts in basic and applied research. This contributes significantly to the agency’s ability to 
mount high-quality programs in VST. 
Facilities- Advanced computational resources, such as powerful graphics workstations, image-
processing systems, supercomputers, and parallel computers, are essential to VST, and NASA excels in 
this regard. NASA also has a number of specialized facilities, such as the Vision Laboratory at Ames 
Research Center, image processing labs at JPL, Goddard, and JSSC, and the robotics lab at JPL, that
contain unique resources. 


Weaknesses 

Lack of identity- With few exceptions, current NASA VST research projects are carried out under 
the umbrella of other programs, rather than supported as VST per se. The result is that researchers are 
frequently diverted to satisfy the explicit demands of the umbrella program. This is inefficient, and 
serves neither program well. VST will be pursued most effectively and enthusiastically when it is 
pursued single-mindedly. 

Short-term emphasis- Another problem that must be acknowledged is the short-term orientation of 
many of NASA's research programs. Despite its mandate to pursue “long-lead, high-risk” research, 
NASA has difficulty in mustering the political and economic courage to support such programs. While 
VST may provide a number of near-term payoffs, it is a rich and difficult area that demands considerable 
long-term research investment. Perhaps the clearest example is in computer vision, where extraordinary 
benefits are almost guaranteed, but where current technology is crude, laborious, and ineffective. 

Small numbers- Despite the magnitude of the vision problems that confront the agency, as described 
above, there is remarkably little in-house expertise. The agency has committed large sums of money to 
university research programs to assure that the base research and technology will be there when it is 
needed. But it should be evident that without in-house knowledge, the agency can neither adequately
select and monitor extramural research nor apply that research to NASA missions. It is
clear that NASA cannot be effective in this area without additional investment in its in-house capability. 

Connections- As Dr. Fletcher has noted, “NASA will never have enough money to do all the things
it wants to do.” It must leverage developments that occur in industry, academia, and other national
labs. This is undoubtedly true of VST, in the sense that the challenge is so large it cannot reasonably be 
undertaken by NASA alone. But the exploitation of extramural research requires that NASA researchers 
establish and maintain connections with the larger VST community. They can do this only if they have a 
quality VST program that is recognized by the larger community. In short, external expertise cannot be 
exploited unless there is internal expertise. 


FUTURE OF NASA VST 


We have reviewed the role of VST in achieving national and NASA goals, and we have examined 
the strengths and weaknesses of the current NASA VST program. In the following section we attempt to 
provide some specific proposals that may serve to build a sound and effective VST program for the 
future. 
Education and Advocacy 


Perhaps the most important single step that can be taken is to acknowledge VST as a coherent disci- 
pline that is central to NASA's goals. This is a matter of education and advocacy. It is important that 
those responsible for planning the direction of NASA programs understand the term VST, and that it 
become as familiar as “propulsion,” “structures,” “communications,” or “artificial intelligence.” This 
educational process is largely the responsibility of the VST community within NASA, and this 
document is a first step. 


VST Program Support 


The second step is to provide explicit program funding for VST. NASA Headquarters (HQ) should 
identify, hopefully with the aid of this document, research areas that fall within the purview for this 
funding program. A PASO should be generated, and centers should be invited to submit Research and 
Technology Objectives and Plans (RTOPs) to this program. 

Research Planning 

More consideration must be given to VST in planning specific science and technology research 
programs. Pathfinder is an example of a program in which VST needs are demonstrably great, but are 
not adequately dealt with. 


Long-Term Emphasis 

As noted previously, NASA efforts are hampered by a demand for short-term results and applica- 
tions. HQ should have the political and economic courage, and the wisdom, to regard VST as a long- 
term research program. It is relatively inexpensive, and the potential benefits are very large, but we 
cannot expect excellence in the NASA program unless we invest for the future. 

Quality 


Too often NASA programs are judged on how much they produce, in terms of demonstrations, 
viewgraphs, and other superficial creations, or in terms of how directly they impact an immediate NASA 
need. This area, we submit, is one which should be judged in large part in terms of simple quality. Since 
much of it is long-term research, the surest criterion of ultimate benefit is excellence, judged by the con- 
ventional criteria of scientific research. HQ can monitor and reward excellence by noting refereed publi- 
cations, invited presentations, memberships on distinguished committees, collaborations with eminent 
scientists, and the like. Quality is also encouraged by greater interaction of NASA personnel with each 
other and with the larger scientific/engineering community (see below). 

Selective Excellence 

NASA should foster excellence in a small set of research areas in which it can make a significant 
contribution. These areas should be chosen with respect to relevance to NASA goals and to the existing 
expertise of NASA groups, and to their long-term scientific and technological promise. Although it is 
hazardous to offer specifics, the following are candidates: 
1. Image processing (Earth resource imagery, astronomy, image management) 

2. Computational models of human and biological vision 

3. Motion processing 

4. Stereo processing 

5. Autonomous visual guidance and hazard avoidance 

6. Image coding and compression 

7. Advanced visual displays 

8. Visual human factors 

9. Neural networks for visual processing 

10. Human factors for visualization 


Center of Excellence 

HQ should consider, within the next decade, establishment of a Center of Excellence in VST. The 
center should capitalize on existing expertise at the research centers, and should be established in col- 
laboration with a major university. The center should take advantage of existing NASA supercomputer 
resources. The center could perhaps be jointly funded by NASA, DOD, and industry. 

Extramural Collaboration 

To leverage NASA’s internal research, collaborations with industrial and academic labs 
should be strongly encouraged. University research supported in this way is much more effective than 
grants that are merely “monitored.” Specifically, this means that intramural funding should be sufficient 
to allow research groups at the individual centers to fund collaborative activities. 

Intercenter Collaboration 

Headquarters and individual centers should encourage collaboration and communication among the 
various centers. There are several mechanisms for accomplishing this: 

1. an annual intercenter workshop, such as those held at Ames in November 1988, and at Langley 
in May 1989 

2. a common source of HQ funding for VST 

3. a single program review for all VST research activities 

We have already begun to develop a directory of individuals involved in NASA VST (see 
Appendix). We hope that this will be useful in planning and collaboration. 
ABSTRACTS 


The following are abstracts of presentations given at the Vision Science and Technology Workshop 
held at Ames Research Center, November 1988. 
N90-22217 


Sampling and Noise in Vision Networks 


Albert J. Ahumada, Jr. 

NASA Ames Research Center 

This research is part of the Human Interface Research Branch-Vision Group's program to develop 
computable models of biological solutions to general vision system problems. Two problem areas are 
addressed: (1) effects of discrete sampling by receptors and (2) effects of visual system noise. 

I. Image sampling 

This research program originated as a collaboration with J. Yellott of UC Irvine on the question of 
why aliases are not seen in our normal vision as they are in images from other sampled systems. It has 
developed into a collaborative program with Yellott on the consequences of image sampling with the 
constrained sampling array disorder found in the retina. 

A. Retinal cone arrangement models (Ahumada and Poirson, 1987). A hard disk packing algorithm 
with random variation in disk diameter can generate sampling arrays with the same sampling properties 
as the primate central fovea. 2. Peripheral cone positions (Yellott, 1983). The sampling properties of the 
peripheral cones are well represented by those of a Poisson hard disk process. 3. Red-green cone 
arrangement (Ahumada, 1987a). Simulated annealing allows models for the arrangement of the red and 
green cones to vary in disorder up to the maximum amount of order allowed by the cone array. 
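The Poisson hard disk process mentioned above for the peripheral cones can be illustrated with simple dart throwing. The following Python is only a minimal sketch and not the authors' code: the function names, the unit-square domain, and the parameter values are assumptions chosen for the example.

```python
import math
import random


def poisson_hard_disk(n_target, min_dist, size=1.0, max_tries=20000, seed=0):
    """Dart throwing: accept a uniformly random point only if it lies at
    least min_dist from every point accepted so far."""
    rng = random.Random(seed)
    points = []
    tries = 0
    while len(points) < n_target and tries < max_tries:
        tries += 1
        p = (rng.uniform(0.0, size), rng.uniform(0.0, size))
        if all(math.dist(p, q) >= min_dist for q in points):
            points.append(p)
    return points


def nearest_neighbor_distances(points):
    """Distance from each point to its closest neighbor."""
    return [min(math.dist(p, q) for q in points if q is not p) for p in points]


# Hypothetical values: 100 "cones" with a minimum spacing of 0.05.
cones = poisson_hard_disk(n_target=100, min_dist=0.05)
nn = nearest_neighbor_distances(cones)
print(len(cones), min(nn))
```

The minimum-distance rule is what separates this process from a plain Poisson process: points stay irregular, but no two fall closer than one disk diameter.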

B. Receptor position learning models. Learning mechanisms can copy the detailed arrangement of 
receptor positions to higher levels of the visual system. 1. Weight adjustment models (Ahumada and 
Yellott, 1988a). Kohonen-like competitive learning models can construct topologically correct maps. 2. 
Position adjustment (Ahumada and Yellott, 1988b). Error-correcting position adjustment can generate 
maps with exact local position information and variable global magnification. 

C. Interpolation network learning models. Interpolation of the sampled image allows further 
processing to ignore the details of the sampling. 1. Chen-Allebach interpolation (Yellott, 1988). A 
generalization of their least squares interpolation works with both irregular spacing and variable density. 
2. Network learning (Ahumada and Yellott, 1988b). A network-implementable gradient descent learning 
algorithm allows the required matrix-inverting network to be learned by spontaneous activity and 
self-generated feedback. 3. Related computational methods. Related algorithms can provide very efficient 
inversion of sparse matrices. They are also useful for inverting image encoding transformations. 
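The idea of learning a matrix inverse by gradient descent can be sketched in a few lines. This is an illustration only: it uses plain gradient descent on the squared error between W A and the identity, which is an assumption and not necessarily the learning rule of the cited work. The matrix A, the learning rate, and the step count are hypothetical.

```python
def matmul(a, b):
    """Product of two small matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def learn_inverse(a, rate=0.1, steps=500):
    """Gradient descent on ||I - W A||^2: W <- W + rate * (I - W A) A^T."""
    n = len(a)
    identity = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    w = [[0.0] * n for _ in range(n)]
    a_t = transpose(a)
    for _ in range(steps):
        wa = matmul(w, a)
        err = [[identity[i][j] - wa[i][j] for j in range(n)] for i in range(n)]
        grad = matmul(err, a_t)
        w = [[w[i][j] + rate * grad[i][j] for j in range(n)] for i in range(n)]
    return w


# Hypothetical 2 x 2 "encoding" matrix; its exact inverse is
# [[0.6, -0.2], [-0.2, 0.4]].
A = [[2.0, 1.0], [1.0, 3.0]]
W = learn_inverse(A)
print(W)
```

The update uses only products of the current output error with the input weights, which is the kind of local computation a network could carry out. The rate must be small enough for the iteration to converge.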

II. Visual system noise limiting signal detection. 

This research has been a collaboration with A. Watson and K. Nielsen at Ames, B. Wandell at 
Stanford, and D. Pelli at Syracuse. Matrix methods of signal processing are applied to visual models. 

A. Equivalent noise of linear models. 1. Spatial noise (Ahumada and Watson, 1985). Low contrast 
detection and recognition models can be represented either by a single filter with white noise or by an 
equivalent image noise. 2. Temporal noise (Watson, 1988). This concept is extended to the continuous 
temporal domain to provide a rational definition of neuronal signal detectability. 

B. Equivalent spatial noise of a nonlinear model (Ahumada, 1987b). For high contrast signals, 
masking dominates; it can be represented as nonlinear signal compression or as stimulus-induced 
visual system noise. 
Networks for Image Acquisition, Processing, and Display 


Albert J. Ahumada, Jr. 

NASA Ames Research Center 

The human visual system comprises layers of networks which sample, process, and code images. 
Understanding these networks is a valuable means of understanding human vision and of designing 
autonomous vision systems based on network processing. Ames Research Center has an ongoing 
program to develop computational models of such networks. 

The models predict human performance in detection of targets and in discrimination of displayed 
information. In addition, the models are artificial vision systems sharing properties with biological 
vision that has been tuned by evolution for high performance. Properties include variable density sam- 
pling, noise immunity, multi-resolution coding, and fault-tolerance. The research stresses analysis of 
noise in visual networks, including sampling, photon, and processing unit noises. 

Specific accomplishments include: 

• Models of sampling array growth with variable density and irregularity comparable to that of the 
retinal cone mosaic 

• Noise models of networks with signal-dependent and independent noise 

• Models of network connection development for preserving spatial registration and interpolation 

• Multi-resolution encoding models based on hexagonal arrays (HOP transform) 

• Mathematical procedures for simplifying analysis of large networks 

This program has resulted in six papers published or in press during the last year. Portions of this 
work were done in collaboration with Stanford University. 
Parallel Asynchronous Hardware Implementation 
of Image Processing Algorithms 


D. D. Coon and A. G. U. Perera 
Microtronics Associates 

Research is being carried out on hardware for a new approach to focal plane processing. The hard- 
ware involves silicon injection mode devices. These devices provide a natural basis for parallel asyn- 
chronous focal plane image preprocessing. The simplicity and novel properties of the devices would 
permit an independent analog processing channel to be dedicated to every pixel. A laminar architecture 
built from arrays of the devices would form a two-dimensional (2-D) array processor with a 2-D array of 
inputs located directly behind a focal plane detector array. A 2-D image data stream would propagate in 
neuron-like asynchronous pulse-coded form through the laminar processor. No multiplexing, digitiza- 
tion, or serial processing would occur in the preprocessing stage. High performance is expected, based 
on pulse coding of input currents down to one picoampere with noise referred to input of about 10 
femtoamperes. Approximately linear pulse coding has been observed for input currents spanning up to seven 
orders of magnitude. Low power requirements suggest utility in space and in conjunction with very large 
arrays. Very low dark current and multispectral capability are possible because of hardware compati- 
bility with the cryogenic environment of high performance detector arrays. 

The aforementioned hardware development effort is aimed at systems which would integrate image 
acquisition and image processing. Acquisition and processing would be performed concurrently as in 
natural vision systems. A key goal of the research will be hardware implementation of algorithms, such 
as the intensity dependent summation algorithm and pyramid processing structures, which are motivated 
by the operation of natural vision systems. Implementation of natural vision algorithms could benefit 
from the use of neuronlike information coding and the laminar, 2-D parallel, vision system type archi- 
tecture. Besides providing a neural network framework for implementation of natural vision algorithms, 
a 2-D parallel approach could eliminate the serial bottleneck of conventional processing systems. Con- 
version to serial format would occur only after raw intensity data has been substantially processed. An 
interesting challenge arises from the fact that mathematical formulation of natural vision algorithms does 
not specify the means of implementation, so that hardware implementation poses additional questions 
involving vision science. 
Visions of Visualization Aids: 

Design Philosophy and Experimental Results 




Stephen R. Ellis 
NASA Ames Research Center 

Aids for the visualization of high-dimensional scientific or other data must be designed. Simply cast- 
ing multidimensional data into a two- or three-dimensional spatial metaphor does not guarantee that the 
presentation will provide insight or parsimonious description of the phenomena underlying the data. 
Indeed, the communication of the essential meaning of some multidimensional data may be obscured by 
presentation in a spatially distributed format. 

Useful visualization is generally based on pre-existing theoretical beliefs concerning the underlying 
phenomena which guide selection and formatting of the plotted variables. Two examples from chaotic 
dynamics are used to illustrate how a visualization may be not merely a pretty picture but an aid to 
insight. 

Dynamic visual displays can help understand how simulation parameters change with time and con- 
ditions but purely visual analysis is dependent upon a subjective perceptual assessment of the display. 
The hope is that a viewer can see new phenomena in the map of the data space that the display provides. 
The presumption that simple visual inspection of the displays will provide insight into the simulation, 
and especially reveal new phenomena, however, assumes the displayed images will be visually compre- 
hensible. This comprehensibility, however, depends upon the appropriateness of the selections of axes 
and the inherent dimensionality of the phenomena to be uncovered. More specifically, if the display is to 
be more illuminating than confusing, at least its dimensionality must match the dimensionality of the 
phenomena. Anyone who has ever seen a dynamic two-dimensional projection of an irregularly 
tumbling four-dimensional cube will quickly appreciate the thrust of this requirement. 

Visualization tools are particularly useful for understanding inherently three-dimensional databases 
such as those used by pilots or astronauts during aircraft or spacecraft maneuvers. Two examples of dis- 
plays to aid spatial maneuvering will be described. The first, a perspective format for a commercial air 
traffic display, illustrates how geometric distortion may be introduced to ensure that an operator can 
understand a depicted three-dimensional situation. The second, a display for planning small spacecraft 
maneuvers, illustrates how the complex counterintuitive character of orbital maneuvering may be made 
more tractable by removing higher-order nonlinear control dynamics, and allowing independent 
satisfaction of velocity and plume impingement constraints on orbital changes. 
Vision Science and Technology for Supervised Intelligent Space Robots 


Jon D. Erickson 
NASA Johnson Space Center 

We believe that robotic vision is required to provide the rich, real-time descriptions of the dynamic 
space environment necessary to enable the intelligent connection between machine perception and action 
needed for free-flying robots that do real work in space. We also believe that space in low Earth orbit 
offers simplicities of only a few objects with these being man-made, known, and of cooperative design. 

The focus of our recent work in robotic vision for application in intelligent space robots such as 
EVA Retriever is in visual function, that is, how information about the space world is derived and then 
conveyed to cognition. The goal of this work in visual function is first to understand how the relevant 
structure of the surrounding world is evidenced by regularities among the pixels of images, then to 
understand how these regularities are mapped on the premises that form the primitive elements of cog- 
nition, and then to apply these understandings with the elements of visual processing (algorithms) and 
visual mechanism (machine organization) to intelligent space robot simulations and test beds. Since 
visual perception is the process of recognizing regularities in images that are known on the basis of a 
model of the world to be reliably related to causal structure in the environment (because perception 
attaches meaning to the link between a conception of the environment and the objective environment), 
our work involves understanding generic, generally applicable models of world structure (not merely 
objects) and how that structure evidences itself in images. Causal structure is of interest so as to be able 
to predict consequences, anticipate events, and plan actions so as to achieve goals. 

Despite a focus on visual function, the majority of the resources expended to date have gone into 
implementation of visual processing and visual mechanism to meet test bed requirements for determin- 
ing object holdsite grasping with dexterous hands and for free-flying navigation with obstacle avoidance. 
Our test bed includes laser range imagers as well as multiple visible and infrared video cameras. 
Computation and Parallel Implementation for Early Vision 


J. Anthony Gualtieri 
NASA Goddard Space Flight Center 

The problem of early vision is to transform one or more retinal illuminance images — pixel arrays — 
to image representations built out of primitive visual features such as edges, regions, disparities, 
clusters, .... These transformed representations form the input to later vision stages that perform higher 
level vision tasks including matching and recognition. We have developed algorithms for: 1) edge 
finding in the scale space formulation; 2) correlation methods for computing matches between pairs of 
images; and 3) clustering of data by neural networks. These algorithms are formulated for parallel 
implementation on SIMD machines, such as the MPP (Massively Parallel Processor), a 128 x 128 array 
processor with 1024 bits of local memory per processor. For some cases we can show speedups of three 
orders of magnitude over serial implementations. 

(1) Edge Detection in Scale Space [M. Manohar and J.A. Gualtieri, in preparation] 

Edge focusing [Bergholm, PAMI-9, 726-741 (1987)] generalizes standard edge detection approaches 
by performing edge detection at a series of scales and tracks the edges from coarse to fine scales. We 
begin with the Canny edge detector [J. Canny, PAMI-8, 678-698 (1986)] which convolves the image, I, 
with the gradient of a Gaussian of size sigma to smooth the image and finds the direction and magnitude 
of the gradient of the image at each pixel. Non-maximum suppression then detects the edge pixels. We 
write E(I; sigma) to denote the binary image of pixels so found [an edge pixel is assigned the value 1 and 
all others are 0]. To this initial edge image we perform a “dilation,” which generates a binary mask of 
pixels, D(E(I; sigma), t), that are within distance t of the already found edge pixels. We again apply the 
Canny edge detector, but now with a smaller sigma to obtain edges resolved to the smaller sigma, and 
only to those pixels in the mask D. Because Bergholm has proved that edges move at most 1 pixel if 
sigma changes by 0.5, no edges are lost. By repeating the dilation followed by edge detection until we 
reach a sigma of size 1 we obtain good edge localization. All computation is inherently local and it is 
straightforward to map the algorithm on the MPP. 
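The coarse-to-fine loop described above can be sketched in one dimension. This is a simplified illustration under stated assumptions, not the MPP implementation: it works on a 1-D signal, replaces the Canny detector with a derivative-of-Gaussian filter followed by a local-maximum test, and uses hypothetical function names, a hypothetical threshold, and a dilation distance t = 1.

```python
import math


def gradient(signal, sigma):
    """Convolve with an (unnormalized) derivative of a Gaussian of size sigma."""
    radius = int(math.ceil(3 * sigma))
    kernel = [-x / sigma ** 2 * math.exp(-x * x / (2 * sigma ** 2))
              for x in range(-radius, radius + 1)]
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = min(max(i - (j - radius), 0), n - 1)  # clamp at the borders
            acc += w * signal[idx]
        out.append(acc)
    return out


def detect(signal, sigma, mask=None, threshold=0.05):
    """E(I; sigma): local maxima of gradient magnitude, optionally inside a mask."""
    g = [abs(v) for v in gradient(signal, sigma)]
    edges = []
    for i in range(1, len(g) - 1):
        if mask is not None and not mask[i]:
            continue
        if g[i] > threshold and g[i] > g[i - 1] and g[i] >= g[i + 1]:
            edges.append(i)
    return edges


def dilate(edges, n, t):
    """D(E, t): mask of samples within distance t of an existing edge."""
    mask = [False] * n
    for e in edges:
        for i in range(max(0, e - t), min(n, e + t + 1)):
            mask[i] = True
    return mask


def edge_focus(signal, sigma_start=4.0, sigma_end=1.0, step=0.5):
    """Detect at a coarse scale, then re-detect inside the dilated mask at finer scales."""
    edges = detect(signal, sigma_start)
    sigma = sigma_start
    while sigma > sigma_end:
        sigma = max(sigma - step, sigma_end)
        edges = detect(signal, sigma, dilate(edges, len(signal), 1))
    return edges


# Hypothetical test signal: a bar that rises at sample 20 and falls at sample 40.
signal = [1.0 if 20 <= i < 40 else 0.0 for i in range(64)]
edges = edge_focus(signal)
print(edges)
```

Because each refinement only examines samples inside the dilated mask, the work at fine scales stays local to edges already found at coarser scales.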

(2) Correlation Methods for Computing Matching in Image Pairs [J.P. Strong, H.K. Ramapriyan, 
Second Symposium on Super Computers, Santa Barbara (1987)] 

A generalization of the stereo problem is to compute local rotations and displacements for a pair of 
images — called the reference image and test image — that matches sub-regions in the image pairs. An 
application of this method is to obtain a vector flow field describing the motion of ice floes (the sub- 
regions) imaged by synthetic aperture radar from a spacecraft at different times. The algorithm contains 
an outer loop that steps through one-half-degree rotations, theta, of the test image relative to the refer- 
ence image. For each rotation, the algorithm computes a global cross correlation of the reference image 
relative to the rotated test image as a function of translation in the horizontal and vertical directions of 
the test image. The cross correlation is then thresholded to select a set of displacements corresponding to 
the largest global matches in the image pairs. Next, for each such displacement the test image is shifted 
by that displacement to obtain a rough match of sub-regions in the image pairs. To find the actual 
sub-regions matched, a local cross correlation between a pair of small windows, say 11 x 11 (in a 512 x 
512 image), in the reference and test images is computed for each pixel in the reference image over a 
search area of a few pixels in the horizontal and vertical directions. Marking the pixels at which this 
“local correlation” exceeds a threshold generates a mask defining the matching sub-regions in the 
reference and test images and further gives, for each pixel in the sub-region in the reference image, the total 
displacement (one part from the rotation plus global cross-correlation displacement and a smaller 
displacement from the local cross correlation) to the matching pixel in the sub-region in the test image. While 
this algorithm is computationally intensive, it has been successfully mapped to the MPP and runs in 
times of the order of a few minutes. 
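The global cross-correlation step can be sketched for the translation-only case. The Python below is a minimal illustration under assumptions of its own: it omits the rotation loop and the local-window stage, uses an unnormalized correlation, and the function names, image size, and test pattern are hypothetical.

```python
def shift_image(img, dy, dx):
    """Return a copy of img translated by (dy, dx), padding with zeros."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                out[ty][tx] = img[y][x]
    return out


def cross_correlation(ref, test, dy, dx):
    """Sum of products of ref with test sampled at offset (dy, dx)."""
    h, w = len(ref), len(ref[0])
    total = 0.0
    for y in range(h):
        for x in range(w):
            ty, tx = y + dy, x + dx
            if 0 <= ty < h and 0 <= tx < w:
                total += ref[y][x] * test[ty][tx]
    return total


def best_displacement(ref, test, max_shift):
    """Search all translations up to max_shift and keep the largest correlation."""
    shifts = [(dy, dx)
              for dy in range(-max_shift, max_shift + 1)
              for dx in range(-max_shift, max_shift + 1)]
    return max(shifts, key=lambda s: cross_correlation(ref, test, s[0], s[1]))


# Hypothetical 10 x 10 reference image with a small asymmetric feature.
reference = [[0.0] * 10 for _ in range(10)]
reference[3][4], reference[4][4], reference[4][5], reference[5][5] = 1.0, 2.0, 3.0, 1.0
test_image = shift_image(reference, 2, -1)
found = best_displacement(reference, test_image, 3)
print(found)
```

In the algorithm described above, this search would be repeated for each half-degree rotation of the test image, and the thresholded peaks would seed the local window correlation.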

(3) Clustering of Point Sets by Neural Networks [B. Kamgar-Parsi, J.A. Gualtieri, J.E. Devaney, and 
B. Kamgar-Parsi, Proc. of the Second Symposium on the Frontiers of Massively Parallel Computing, 
George Mason Univ. (1988)] 

An important problem for early vision is to be able to partition N visual features, such as points in a 
two-dimensional space, into K clusters — in a way that those in a given cluster are more similar to each 
other than to the rest of the clusters. As there are approximately K**N/K! ways of partitioning the points 
among the K clusters, finding the best solution is beyond exhaustive search when N is large (say 128). 
This problem can be formulated as an optimization problem for which very good, but not necessarily 
optimal solutions can be found using a neural network. We have constructed a cost function to be mini- 
mized that is composed of a “syntax” term that enforces the constraint that each point must belong to 
only one cluster, and a “goodness of fit” term that measures the quality of the solution. Though the prob- 
lem involves a discrete optimization, by embedding it in the continuous space of an analog network we 
are able to perform a downhill search on the cost function that is more purposeful and effective than a 
search in a discrete space. Solutions are generated by starting the network from many randomly selected 
initial states and then taking the best solution from the ensemble of solutions found. The network is 
simulated on the MPP where we use the massive parallelism not only in solving the differential equa- 
tions that govern the evolution of the network, but also in starting the network from many initial states at 
once thus obtaining many solutions in one run. We obtain speedups of two to three orders of magnitude 
over serial implementations and further we obtain better quality solutions than conventional techniques 
such as K-means clustering. 
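The size of the search space quoted above can be checked numerically. The sketch below compares the approximation K**N/K! with the exact count of partitions into K non-empty clusters (a Stirling number of the second kind). The choice K = 4 for the N = 128 case is an assumption made for illustration, since the text does not give a value of K.

```python
from math import factorial


def stirling2(n, k):
    """Exact number of ways to split n labeled points into k non-empty clusters."""
    table = [[0] * (k + 1) for _ in range(n + 1)]
    table[0][0] = 1
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            # Point i either joins one of j existing clusters or starts a new one.
            table[i][j] = j * table[i - 1][j] + table[i - 1][j - 1]
    return table[n][k]


def approx_partitions(n, k):
    """The approximation K**N / K!, rounded down to an integer."""
    return k ** n // factorial(k)


small_exact = stirling2(10, 3)
small_approx = approx_partitions(10, 3)
large_exact = stirling2(128, 4)       # K = 4 is a hypothetical choice
large_approx = approx_partitions(128, 4)
print(small_exact, small_approx)
print(len(str(large_exact)), "decimal digits for N = 128, K = 4")
```

The approximation improves as N grows relative to K. With a count of this size for N = 128, exhaustive search is plainly out of reach.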

We see the neural network approach as being important for early vision in that 1) the methods of 
“programming” neural networks described here can be generalized to other problems such as determin- 
ing cluster shape, fitting more general features such as lines and parametric curves to visual data, and 
extending these results to higher dimensional spaces, and, 2) with the advent in the next few years of 
Analog VLSI devices, these algorithms can be straightforwardly mapped to silicon with potential 
speedups of up to nine orders of magnitude over conventional serial implementations. 
Intensity Dependent Spread Theory 


R. Holben 
Odetics, Inc. 


The Intensity Dependent Spread (IDS) procedure is an image-processing technique based on a model 
of the processing which occurs in the human visual system (1,2). IDS processing is relevant to many 
aspects of machine vision and image processing. For quantum limited images, it produces an ideal trade- 
off between spatial resolution and noise averaging, performs edge enhancement thus requiring only 
mean-crossing detection for the subsequent extraction of scene edges, and yields edge responses whose 
amplitudes are independent of scene illumination, depending only upon the ratio of the reflectance on 
the two sides of the edge. These properties suggest that the IDS process may provide significant band- 
width reduction while losing only minimal scene information when used as a preprocessor at or near the 
image plane. 
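The illumination-independence property can be illustrated with a one-dimensional sketch. The form used here is an assumption, since the abstract does not spell out the operator: each input sample is taken to spread a Gaussian whose height is proportional to its intensity and whose width is inversely proportional to it, so that every sample contributes unit area. The function names and intensity levels are hypothetical.

```python
import math


def ids_response(intensity):
    """Sum of spread functions: height ~ intensity, width ~ 1 / intensity."""
    n = len(intensity)
    norm = 1.0 / math.sqrt(2.0 * math.pi)
    out = [0.0] * n
    for i, level in enumerate(intensity):
        for j in range(n):
            u = level * (j - i)  # distance measured in units of 1 / level
            out[j] += level * norm * math.exp(-0.5 * u * u)
    return out


def step_edge(low, high, n=400, edge=200):
    return [low if i < edge else high for i in range(n)]


# Two edges with the same 2:1 ratio, the second under doubled illumination.
dim = ids_response(step_edge(0.05, 0.10))
bright = ids_response(step_edge(0.10, 0.20))
peak_dim = max(dim[100:300])
peak_bright = max(bright[100:300])
print(peak_dim, peak_bright, dim[100])
```

Under these assumptions a uniform region gives an output close to 1 whatever its intensity, and the two edge peaks agree closely even though the illumination differs by a factor of two. Only the spatial extent of the response changes.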
Image Gathering, Coding, and Processing: End-to-End Optimization 
for Efficient and Robust Acquisition of Visual Information 


Friedrich O. Huck and Carl L. Fales 
NASA Langley Research Center 

We are concerned with the end-to-end performance of image gathering, coding, and processing. The 
applications range from high-resolution television to vision-based robotics, wherever the resolution, effi- 
ciency and robustness of visual information acquisition and processing are critical. For our presentation 
at this workshop, it is convenient to divide research activities into the following two overlapping areas: 

1) The development of focal-plane processing techniques and technology to effectively combine 
image gathering with coding. The emphasis is on low-level vision processing akin to the retinal process- 
ing in human vision. Our approach includes the familiar Laplacian pyramid, the new intensity-dependent 
spatial summation, and parallel sensing/processing networks. Three-dimensional image gathering is 
attained by combining laser ranging with sensor-array imaging. This work is summarized in the 
following five abstracts by T. Cornsweet, G. Westrom, E. Kurrasch and R. Holben, R. Holben and 
G. Westrom, and D. Coon and A. Perera. 

2) The rigorous extension of information theory and optimal filtering to visual information 
acquisition and processing. The goal is to provide a comprehensive methodology for quantitatively 
assessing the end-to-end performance of image gathering, coding, and processing. Information theory 
allows us to establish upper limits on the visual information which can be acquired within given con- 
straints, and optimal filtering allows us to establish upper limits on the performance that can be attained 
for specific tasks, even if these tasks require adaptive or interactive processing. This work is summarized 
in the remainder of this abstract. 

The performance of (digital) image-gathering systems is constrained by the spatial-frequency 
response of optical apertures, the sampling passband of photon-detection mechanisms, and the noise 
generated by photon detection and analog-to-digital conversion. Biophysical limitations have imposed 
similar constraints on natural vision. Visual information is inevitably lost in both image gathering and 
low-level vision by aliasing, blurring, and noise. It is therefore no longer permissible to assume suffi- 
cient sampling as Shannon and Wiener could do in their classical works, respectively, on communication 
theory and optimal filtering for time-varying signals. Nevertheless, the digital processing algorithms (for 
image restoration, edge enhancement, etc.) found in the currently prevailing literature assume sufficient 
sampling, whereas image-gathering systems are ordinarily designed to permit considerable insufficient 
sampling. This fundamental difference between assumption and reality has caused unnecessary limita- 
tions in the performance of digital image gathering, coding, and processing. It has also led to unreliable 
conclusions about the correct design of image-gathering systems for visual information processing (as 
opposed to image reconstruction without processing, e.g., commercial television) and about the actual 
performance of image-coding schemes for tasks which involve digital image processing. 
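The effect of insufficient sampling can be seen in a small numerical example. The sketch below, with a hypothetical sampling rate and frequencies, shows that a cosine above the Nyquist limit produces exactly the same samples as a lower-frequency alias, so no later processing can tell the two apart.

```python
import math


def sample_cosine(freq, sample_rate, count):
    """Samples of cos(2*pi*freq*t) taken at t = n / sample_rate."""
    return [math.cos(2.0 * math.pi * freq * n / sample_rate) for n in range(count)]


sample_rate = 10.0                                   # Nyquist limit is 5 cycles per unit
above_nyquist = sample_cosine(7.0, sample_rate, 20)  # insufficiently sampled
alias = sample_cosine(3.0, sample_rate, 20)          # 10 - 7 = 3
max_gap = max(abs(a - b) for a, b in zip(above_nyquist, alias))
print(max_gap)
```

The largest difference between the two sample sequences is at the level of floating-point rounding.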

Our analyses so far have shown that the combined process of image gathering and optimal process- 
ing can be treated as a communication channel if (and only if) the image-gathering degradations are 
correctly accounted for. Correctly restored images gain significantly in fidelity (similarity to target), 
resolution (minimum discernible detail), sharpness (contrast between large areas), and clarity (absence 
of visible artifacts). These improvements in visual quality are obtained solely by the correct end-to-end 
optimization without increase in either data transmission or processing. Similar improvements can also 
be made in the resolution and accuracy of edge detection. Furthermore, if we implement the edge 
enhancement with focal-plane processing by properly combining optical response with lateral inhibition, 
it is possible to reduce data processing and transmission requirements and to improve robustness to 
noise. These results have encouraged us to extend our analyses to various image-coding schemes and the 
associated image restoration and feature-extraction algorithms. 
Hybrid Vision Activities at NASA Johnson Space Center 


Richard D. Juday 
NASA Johnson Space Center 

NASA’s Johnson Space Center in Houston, Texas, is active in several aspects of hybrid image 
processing. (The term “hybrid image processing” refers to a system that combines digital and photonic 
processing.) Our major thrusts are autonomous space operations such as planetary landing, servicing, 
and rendezvous and docking. By processing images in non-Cartesian geometries to achieve shift 
invariance to canonical distortions, we use certain aspects of the human visual system for machine 
vision. That technology flow is bidirectional; we are investigating the possible utility of video-rate 
coordinate transformations for human low-vision patients. Man-in-the-loop teleoperations are also 
supported by the use of video-rate image-coordinate transformations, as we plan to use bandwidth 
compression tailored to the varying spatial acuity of the human operator. 
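One non-Cartesian geometry with this kind of property is the log-polar mapping, in which magnification and rotation about the origin become simple shifts. The abstract does not name the specific geometries used, so the choice of log-polar coordinates here is an assumption. The function names and test points are hypothetical.

```python
import math


def log_polar(x, y):
    """Map Cartesian (x, y) to (log radius, angle)."""
    return math.log(math.hypot(x, y)), math.atan2(y, x)


def transform(x, y, scale, angle):
    """Magnify by scale and rotate by angle about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return scale * (c * x - s * y), scale * (s * x + c * y)


points = [(1.0, 0.5), (2.0, 1.0), (0.5, 1.5)]
scale, angle = 2.0, 0.3
shifts = []
for x, y in points:
    r0, a0 = log_polar(x, y)
    r1, a1 = log_polar(*transform(x, y, scale, angle))
    shifts.append((r1 - r0, a1 - a0))
print(shifts)
```

Every point moves by the same amount in the new coordinates, (log of the scale, rotation angle), so a shift-invariant correlator applied there would tolerate these two distortions.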

Technological elements being developed in the program include upgraded spatial light modulators, 
real-time coordinate transformations in video imagery, synthetic filters that robustly allow estimation of 
object pose parameters, convolutionally blurred filters that have continuously selectable “invariance” to 
such image changes as magnification and rotation, and optimization of optical correlation done with 
spatial light modulators that have limited range and couple both phase and amplitude in their response. 

Liaisons of varying activity and maturity exist between JSC and the Army (the 
Missile Command at the Redstone Arsenal: SLM and filter development; the Human Engineering 
Laboratory at the Aberdeen Proving Ground: video image compression for teleoperations) and the Air 
Force (RADC, Hanscom Field: exchanges relating to phase-mostly filter theory). JSC is promoting 
NASA participation in the DARPA/MICOM/RADC optical correlator development, in order to adapt the 
results of that hardware development for use in space vision. 
Human Motion Perception: Higher-Order Organization 


Mary K. Kaiser 
NASA Ames Research Center 
Dennis R. Proffitt 
University of Virginia 


This talk presents an overview of higher-order motion perception and organization. We argue that 
motion is sufficient to fully specify a number of environmental properties, including: depth order, three- 
dimensional form, object displacement, and dynamics. A grammar of motion perception is proposed; 
applications of this work for display design are discussed. 


Goals of Research: 

To define the competencies, limitations, and biases in human perception of motion events. 


Application: 

To design dynamic displays which exploit operators’ competencies and compensate for limitations 
and biases. 

What kinds of information can be specified via motion? 

• Surface segregation 

• Three-dimensional form 

• Object displacement 

• Dynamics 

Surface Segregation (Depth Order Specification) 

• Static depictions must rely on cues, conventions, or appeals to expectations: contrast, occlusion, 
familiarity, shading 

• Motion, in and of itself, is sufficient to fully specify depth order (even if edge information is 
deleted) 

Three-Dimensional Form 

• An indefinite number of three-dimensional distal objects could produce a given two-dimensional 
pattern. 

• Form specification through rotation resolves ambiguity (assuming rigid object): Kinetic Depth 
Effect. 

• Perspective information (e.g., foreshortening of lines) not required; works with point-light display. 

Object Displacement 

• Motion of objects relative to observer virtually impossible to depict with static symbols and 
conventions. 

• It is necessary to consider how the perceptual system parses object motion. 

Dynamics 

• Kinematics can specify underlying kinetics, at least to classes of solutions (e.g., relative masses of 
colliding objects). 

• Observers demonstrate appreciations of dynamic properties even for events about which they hold 
erroneous beliefs. 

Current Research 

• Determine limits of perceptual competence (e.g., angular systems) 

• Differentiate observers' ability to extract kinematic information vs. ability to perform dynamic 
analysis 

• Develop taxonomy of event complexity (particle vs. extended body, dimensionality, dynamical 
feature analysis) 
Two-Dimensional Shape Recognition Using 
Sparse Distributed Memory 


Pentti Kanerva and Bruno Olshausen 

Research Institute for Advanced Computer Science 
NASA Ames Research Center 

We propose a method for recognizing two-dimensional shapes (hand-drawn characters, for example) 
with an associative memory. The method consists of two stages: First, the image is preprocessed to 
extract tangents to the contour of the shape. Second, the set of tangents is converted to a long bit string 
for recognition with sparse distributed memory (SDM). [SDM provides a simple, massively parallel 
architecture for an associative memory. Long bit vectors (256-1000 bits, for example) serve as both data 
and addresses to the memory, and patterns are grouped or classified according to similarity in Hamming 
distance. See Kanerva (1988) for details on SDM, and Keeler (1988) for a comparison to Hopfield nets.] 

At the moment, tangents are extracted in a simple manner by progressively blurring the image and 
then using a Canny-type edge detector (Canny, 1986) to find edges at each stage of blurring. This results 
in a grid of tangents, such as shown in Figure 1 for the letter A. While the technique used for obtaining 
the tangents is at present rather ad hoc, we plan to adopt an existing framework for extracting edge ori- 
entation information over a variety of resolutions, such as suggested by Watson (1987, 1983), Marr and 
Hildreth (1980), or Canny (1986). 


The grid of tangents is converted to a long bit pattern by encoding the orientation at each point with 
three bits. The three-bit encodings for each orientation are chosen in such a way that the Hamming dis- 
tance between code words is related to angular distance, as shown in Table 1. The encodings at all the 
grid points are then concatenated into a long bit pattern, as shown in Figure 2. This bit pattern then 
serves as a reference address and/or data word for SDM. 
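
The encoding step can be sketched as follows. Table 1 is not reproduced here, so the code words below are a hypothetical choice (six orientations at 30-degree steps, coded cyclically) that merely has the stated property: Hamming distance between code words grows with angular distance. The function names are likewise illustrative.

```python
# Assumed code words for orientations 0, 30, 60, 90, 120, 150 degrees.
# Neighboring orientations differ in one bit; perpendicular ones in three.
ORIENTATION_CODES = ["000", "001", "011", "111", "110", "100"]

def encode_orientation(angle_deg):
    """Quantize an orientation (taken modulo 180 degrees) to a three-bit code."""
    index = round((angle_deg % 180.0) / 30.0) % 6
    return ORIENTATION_CODES[index]

def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    return sum(ca != cb for ca, cb in zip(a, b))

def encode_grid(tangent_angles):
    """Concatenate the codes of all grid points into one long bit pattern."""
    return "".join(encode_orientation(a) for a in tangent_angles)
```

With this assumed code, two grids whose tangents differ slightly in orientation yield bit patterns that are close in Hamming distance, which is the property the memory relies on for grouping similar shapes.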

The main advantages of this approach are that 1) SDM is capable of searching among many stored 
patterns in parallel, 2) SDM corrects for noise in the address and data, and 3) the features obtained from 
the preprocessing stage are chosen and encoded in such a way that bit patterns corresponding to percep- 
tually similar shapes (as judged by humans) are close to each other in Hamming distance. We are 
currently running simulations of this method on a SUN 3/60. 
The Intensity Dependent Spread Model and Color Constancy 


Ellie Kurrasch 
Odetics, Inc. 

Odetics is investigating the use of the intensity dependent spread (IDS) model for determining color 
constancy. Object segmentation is performed effortlessly by the human visual system, but building a 
computer vision system that takes an image as input and identifies objects on the basis of color is 
difficult. The unknown aspects of the light illuminating a scene, in space or anywhere else, can seriously 
interfere with the use of color for object identification. The color of an image depends not only on 
the physical characteristics of the object, but also on the wavelength composition of the incident 
illumination. IDS processing extracts edges and reflectance changes across edges, independent 
of variations in scene illumination. The IDS response at an edge depends solely on the ratio of the reflectances on the 
two sides of the edge. We are in the process of using IDS to recover the reflectance image. 
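
The illumination independence of a reflectance ratio follows from a one-line argument. As a simplifying assumption not stated in the abstract, take the image luminance L to be the product of the illuminance E and the surface reflectance R, with E the same on both sides of the edge. The symbols L, E, and R are introduced here only for illustration.

```latex
L_1 = E\,R_1, \qquad L_2 = E\,R_2
\quad\Longrightarrow\quad
\frac{L_1}{L_2} = \frac{E\,R_1}{E\,R_2} = \frac{R_1}{R_2}
```

Under this assumption the illuminance cancels, so any quantity that depends only on the luminance ratio across an edge is unchanged when the scene illumination is scaled.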
Filling in the Retinal Image 


James Larimer 
Ames Research Center 
Thomas Piantanida 
SRI International 

The optics of the eye form an image on a surface at the back of the eyeball called the retina. The 
retina contains the photoreceptors that sample the image and convert it into a neural signal. The spacing 
of the photoreceptors in the retina is not uniform and varies with retinal locus. The central retinal field, 
called the macula, is densely packed with photoreceptors. The packing density falls off rapidly as a func- 
tion of retinal eccentricity with respect to the macular region and there are regions in which there are no 
photoreceptors at all. The retinal regions without photoreceptors are called blind spots or scotomas. 

The neural transformations which convert retinal image signals into percepts fill in the gaps and 
regularize the inhomogeneities of the retinal photoreceptor sampling mosaic. The filling-in mechanism 
is so powerful that we are generally not aware of our physiological blind spot, where the nerve head 
exits the eyeball, or other naturally occurring scotomas such as the central field loss that occurs during 
night vision. Individuals with pathological scotomas are also generally unaware of the field losses that 
result from the pathology. 

The filling-in mechanism plays an important role in understanding visual performance. For example, 
a person with a peripheral field loss is usually unaware of the loss and subjectively believes that his or 
her vision is as good as ever, yet his or her performance in a task such as driving can be severely 
impaired. 

The filling-in mechanism is not well understood. A systematic collaborative research program at the 
Ames Research Center and SRI in Menlo Park, California, has been designed to explore this mechanism. 
It has been known for some time that when an image boundary is stabilized on the retina the boundary is 
not perceived. Using image-stabilization techniques, we have been able to show that retinally local adap- 
tation (the control of sensitivity) can be separated from more central neural effects which control the 
appearance of fields. In particular, we have shown that the perceived fields, which are in fact different 
from the image on the retina because of filling-in, control some aspects of performance and not others. We 
have linked these mechanisms to putative mechanisms of color coding and color constancy. 
A3I Visibility Modeling Project 


James Larimer 
Ames Research Center 
Aries Arditi 
New York Association for the Blind 
James Bergen 
David Sarnoff Research Laboratories 
Norman Badler 
The University of Pennsylvania 

The Army-NASA Aircrew Aircraft Integration program is supporting a joint project to build a visi- 
bility computer-aided design (CAD) tool. The principal participants in the project are Dr. James Larimer 
of the Ames Research Center, Dr. Aries Arditi of the Research Department of the New York Association 
for the Blind, Dr. James Bergen of the SRI Sarnoff Research Laboratories, and Dr. Norman Badler of 
the University of Pennsylvania. 

CAD has become an essential tool in modern engineering applications. CAD tools are used to create 
engineering drawings and to evaluate potential designs before they are physically realized. The visibility 
CAD tool will aid the design engineer in the location and specification of windows, 
displays, and controls in crewstations. In an aircraft cockpit, the location of instruments and the 
emissive and reflective characteristics of the surfaces must be determined to assure adequate aircrew 
performance. For example, how big should letters be on a display to assure that they can always be read 
without error? How much contrast should the symbols have with the background? How bright should 
emissive displays be so that they will not be “washed out” by bright sunlight? 

The visibility CAD tool will allow the designer to ask and answer many of these questions in the 
context of a three-dimensional graphical representation of the cockpit. The graphic representation of the 
cockpit is a geometrically valid model of the cockpit design. A graphic model of a pilot, called the pilot 
manikin, can be placed naturalistically in the cockpit model. The visibility tool has the capability of 
mapping the cockpit surfaces and other objects modeled in this graphic design space onto the simulated 
pilot's retinas for a given visual fixation. Moreover, the binocular retinal “footprint” can be mapped onto 
the environmental surfaces implied by the cockpit design and modeled objects in the graphic space. 
These capabilities and the sequential application of them permit the designer to estimate the required 
size and contrast of letters, numbers and symbols to be used by the instruments. Moreover, the system 
will permit the application of human visual processing models to predict the legibility of textual 
materials in the displays. Models of the ambient lighting and the adaptation state of the simulated pilot are 
being adapted to permit predictions of visibility and legibility over a large variety of conditions. 
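
Two of the design questions raised above reduce to simple geometry and photometry, sketched below. This is an illustration only, not the project's model: the function names are hypothetical, and the visual angle and luminance values used in the usage note are assumptions rather than recommended standards.

```python
import math

def letter_height(viewing_distance, visual_angle_arcmin):
    """Height subtending the given visual angle at the given distance.

    The result is in the same units as viewing_distance.
    """
    half_angle = math.radians(visual_angle_arcmin / 60.0) / 2.0
    return 2.0 * viewing_distance * math.tan(half_angle)

def weber_contrast(symbol_luminance, background_luminance):
    """Contrast of a symbol against its background."""
    return (symbol_luminance - background_luminance) / background_luminance

def contrast_under_glare(symbol_luminance, background_luminance, veiling_luminance):
    """Contrast after reflected ambient light adds equally to symbol and background."""
    return weber_contrast(symbol_luminance + veiling_luminance,
                          background_luminance + veiling_luminance)
```

For example, `letter_height(0.7, 20.0)` gives the height in meters of a letter subtending an assumed 20 arcminutes at 0.7 m, and `contrast_under_glare` shows how an emissive display is "washed out" as the veiling luminance from sunlight grows.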
Motion Detection in Astronomical and Ice Floe Images 


M. Manohar, H. K. Ramapriyan, and J. P. Strong 
NASA Goddard Space Flight Center 

Two approaches are presented for establishing correspondence between small areas in pairs of 
successive images for motion detection. The first one, based on local correlation, is used on a pair of 
successive Voyager images of Jupiter which differ mainly in locally variable translations. This 
algorithm is implemented on a sequential machine (VAX 780) as well as on the Massively Parallel 
Processor (MPP). In the case of the sequential algorithm, the pixel correspondence or match is computed on 
a sparse grid of points using nonoverlapping windows (typically 11 x 11) by local correlations over a 
predetermined search area. The displacement of the corresponding pixels in the two images is called the 
disparity. The disparities are fitted to cubic surfaces. The disparities at points where the error between the computed values and 
the surface values exceeds a particular threshold are replaced by the surface values. A bilinear 
interpolation is then used to estimate disparities at all other pixels between the grid points. When this algorithm 
was applied at the Red Spot in the Jupiter image, the rotating velocity field of the storm was determined. 

The computation required for this algorithm is proportional to the area of the image and takes about 
one-half hour for a 128 x 128 image with a local window of size 11 x 11 and a search area of 11 x 11. The 
parallel implementation on the MPP is exactly the same except that correspondences are established at every 
point rather than on a sparse grid of points. Thus this implementation needs no interpolation step. The 
results obtained in both cases are comparable for this image. However, for images which are not smooth, 
the implementation on the MPP giving results at each pixel is more accurate. The time taken on the MPP 
is about 10 seconds. 
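
The local-correlation matching step can be sketched as below. This is a minimal illustration, not the Goddard implementation: the function names are hypothetical, zero-mean normalized correlation is an assumed choice of correlation measure, and the surface-fitting and interpolation stages are omitted. Images are plain lists of rows.

```python
import math

def window(img, row, col, half):
    """Flatten the (2*half+1)-square window centered at (row, col).

    The caller must keep the window inside the image; negative indices
    would silently wrap around in Python.
    """
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]

def correlation(a, b):
    """Zero-mean normalized correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den if den > 0 else 0.0

def disparity_at(img1, img2, row, col, half=2, search=2):
    """Return the (d_row, d_col) shift at which img2 best matches img1's window."""
    ref = window(img1, row, col, half)
    best, best_shift = -2.0, (0, 0)
    for dr in range(-search, search + 1):
        for dc in range(-search, search + 1):
            score = correlation(ref, window(img2, row + dr, col + dc, half))
            if score > best:
                best, best_shift = score, (dr, dc)
    return best_shift
```

Setting `half=5` and `search=5` corresponds to the 11 x 11 window and search area quoted in the text. Calling `disparity_at` on a sparse grid mirrors the sequential version, while calling it at every pixel mirrors what the parallel version computes.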

The second method of motion detection is applicable to pairs of images in which corresponding 
areas can experience considerable translation as well as rotation. Ice floe images obtained from the syn- 
thetic aperture radar (SAR) instrument flown onboard the Seasat spacecraft belong to this class. The 
time interval between two successive images of a given region was as much as three days. During this 
period, large translations and rotations of ice floes can occur. Therefore, conventional local correlation 
techniques which perform searches in a small neighborhood to detect translated features have a very 
small chance of success. To account for large translations and rotations, it is necessary to perform large 
area searches in a three-dimensional space (two translational and one rotational). This makes conven- 
tional correlation techniques computationally intensive even on a high-speed parallel computer such as 
the MPP. A parallel algorithm has been developed and implemented on the MPP for locating corre- 
sponding objects based on their translationally and rotationally invariant features. The algorithm first 
approximates the edges in the images by polygons or sets of connected straight-line segments. Each such 
“edge structure” is then reduced to a “seed point.” Associated with each seed point are the descriptions 
(lengths, orientations, and sequence numbers) of the lines constituting the corresponding edge structure. 
A parallel matching algorithm is used to match packed arrays of such descriptions to identify corre- 
sponding seed points in the two images. The matching algorithm is designed such that fragmentation and 
merging of ice floes are taken into account by accepting partial matches. The technique has been demon- 
strated to work on synthetic test patterns and real image pairs from Seasat in times ranging from 0.6 to 
0.7 seconds for 128 x 128 images. 
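
The idea of describing an edge structure by translationally and rotationally invariant features can be sketched as follows. This is an assumed simplification for illustration: the function names are hypothetical, segment lengths and turning angles stand in for the descriptions used in the abstract, and the seed-point reduction, packed arrays, and partial matching are not modeled.

```python
import math

def edge_descriptor(points):
    """Describe a polyline by its segment lengths and turning angles.

    Both quantities are unchanged when the polyline is translated or rotated.
    """
    vectors = [(x1 - x0, y1 - y0)
               for (x0, y0), (x1, y1) in zip(points, points[1:])]
    lengths = [math.hypot(dx, dy) for dx, dy in vectors]
    turns = []
    for (ax, ay), (bx, by) in zip(vectors, vectors[1:]):
        # Signed angle from one segment to the next.
        turns.append(math.atan2(ax * by - ay * bx, ax * bx + ay * by))
    return lengths, turns

def similar(desc_a, desc_b, tol=1e-6):
    """True if two descriptors agree in every length and turning angle."""
    lengths_a, turns_a = desc_a
    lengths_b, turns_b = desc_b
    if len(lengths_a) != len(lengths_b):
        return False
    return (all(abs(p - q) <= tol for p, q in zip(lengths_a, lengths_b))
            and all(abs(p - q) <= tol for p, q in zip(turns_a, turns_b)))
```

A matcher that tolerates fragmentation and merging, as the abstract describes, would accept agreement on only part of the segment sequence rather than requiring the full match tested by `similar`.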
Passive Navigation Using Image Irradiance Tracking 


P. K. A. Menon 
Georgia Institute of Technology 

Rotorcraft operating at low altitudes require navigational schemes for detecting terrain and obstacles. 
Due to the nature of the missions to be accomplished and available power onboard, a passive navigation 
scheme is desirable in this situation. This paper describes the development of a passive navigation 
scheme using optical image sequences and vehicle motion variables from an onboard inertial navigation 
scheme. This approach combines the geometric properties of perspective projection and a feedback 
irradiance tracking scheme at each pixel in the image to determine the range to various objects within 
the field-of-view. Derivation of the numerical algorithm and simulation results are given. Due to the 
feedback nature of the implementation, the computational scheme is robust. Other applications of the 
proposed approach include navigation for autonomous planetary rovers and telerobots. 

This research was supported under NASA grant NAG2-463. 
Factors Affecting the Perception of Transparent Motion 


Jeffrey B. Mulligan 
NASA Ames Research Center 

It is possible to create a perception of transparency by combining patterns having different motions. 
Two particular combination rules have specific interpretations in terms of physical phenomena: additive 
(specular reflection) and multiplicative (shadow illumination). Arbitrary combination rules applied to 
random patterns generate percepts in which the motions of the two patterns are visible, but have super- 
imposed noise. It is also possible to combine the patterns (using an exclusive-OR rule) so that only noise 
is visible. Within a one-dimensional family of combination rules which include addition and multiplica- 
tion, there is a range where smooth motions are seen with no superimposed noise; this range is centered 
about the additive combination. This result suggests that the motion system deals with a linear represen- 
tation of luminance, and is consistent with the analysis of motion by linear sensors. 
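
The three named combination rules can be written down directly for binary random patterns. The sketch below is a static illustration under stated assumptions, not a simulation of the experiments: the function names are hypothetical, and the Pearson correlation between the combined pattern and one component is used here only as a crude stand-in for how much of that component a linear sensor could recover.

```python
import math
import random

def additive(a, b):
    """Average of the two patterns (beam-splitter style superposition)."""
    return [(x + y) / 2.0 for x, y in zip(a, b)]

def multiplicative(a, b):
    """Product of the two patterns (one pattern shadowing the other)."""
    return [x * y for x, y in zip(a, b)]

def exclusive_or(a, b):
    """XOR of two binary (0/1) patterns."""
    return [float(int(x) ^ int(y)) for x, y in zip(a, b)]

def pearson(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def component_trace(rule, n=20000, seed=0):
    """Correlation between a combined pattern and one of its two components."""
    rng = random.Random(seed)
    a = [float(rng.getrandbits(1)) for _ in range(n)]
    b = [float(rng.getrandbits(1)) for _ in range(n)]
    return pearson(rule(a, b), a)
```

For independent patterns the additive and multiplicative combinations remain substantially correlated with each component, whereas the exclusive-OR combination is essentially uncorrelated with either one, which is in keeping with the observation that only noise is visible under that rule.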

This research tentatively validates the use of beam splitters (which combine images additively) 
in the construction of heads-up aviation displays. Further work is needed to determine if the superiority 
of additive combination generalizes to the case of full-color imagery (there are results in the literature 
suggesting that subtractive color mixture yields the best legibility of overlapping alphanumerics). 
Photonic Processing at NASA Ames Research Center 


Ellen Ochoa and Max Reid 
NASA Ames Research Center 

The Photonic Processing group is engaged in applied research on optical processors in support of the 
Ames vision to lead the development of autonomous intelligent systems. Optical processors, in conjunc- 
tion with numeric and symbolic processors, are needed to provide the powerful processing capability 
that is required for many future agency missions. The research program emphasizes application of 
analog optical processing, where free-space propagation between components allows natural implemen- 
tations of algorithms requiring a large degree of parallel computation. Special consideration is given in 
the Ames program to the integration of optical processors into larger, heterogeneous computational 
systems. Demonstration of the effective integration of optical processors within a broader knowledge- 
based system is essential to evaluate their potential for dependable operation in an autonomous 
environment such as space. 

The Ames Photonics program is currently addressing several areas of interest. One of the efforts is to 
develop an optical correlator system with two programmable spatial light modulators (SLMs) to perform 
distortion invariant pattern recognition. Part of this work has been to develop a new type of filter to be 
placed in the spectral plane that uses information in the design procedure about the particular SLM on 
which it will be implemented. Laboratory work is aimed at the verification of this filter's performance. 
The SLM device used in our laboratory is an electronically-addressable magneto-optic array known as a 
SIGHT-MOD. An electronic controller for the SIGHT-MOD has been designed and built and is currently 
being tested; the controller will be able to store 100 filters used for object recognition and rapidly 
address the device with a desired sequence of filters. This high-speed I/O capability is a key step in plans 
to integrate the optical processor with a knowledge-based system for image recognition and 
classification. 

Another area of research is optical neural networks, also for use in distortion-invariant pattern 
recognition. Most promising of the models investigated are higher-order neural networks; to date, a 
small third-order net which distinguishes two objects regardless of size, position, or rotation has been 
demonstrated in software. The large number of interconnections needed in these architectures leads to 
consideration of optical implementations. Experimental work on developing an optical neural network 
will involve evaluating holographic implementations of weighted network connections, as well as testing 
optical or hybrid optical/electronic implementations of thresholding units to realize neuronic elements. 

Optical matrix processors are being investigated for implementing neural net techniques to perform 
multispectral data analysis. The problem is to sort out three-dimensional (x, y, lambda) data and to 
determine, for every pixel in a scene, all minerals present and the amount of each, and to estimate the spectrum of unknown 
elements. This type of analysis is needed for site selection and sample analysis on planetary explorations 
as well as for many types of astronomical and earth sensing data. 
Sparse Distributed Memory Overview 


Mike Raugh 

Research Institute for Advanced Computer Science 
NASA Ames Research Center 

One of NASA's grand challenges is to build autonomous machines and systems that are capable of 
learning to perform tasks too tedious, or in places too remote and too hostile, for humans. The goal of 
the Learning Systems Division of RIACS is to find new approaches to autonomous systems based upon 
sound mathematical and engineering principles and the need to know how information processing is 
organized in animals, and to test the applicability of these new approaches to the grand challenge. The 
program includes the development of theory, implementations in software and hardware, and 
explorations of potential areas for applications. 

There are two projects in the Learning Systems Division — Sparse Distributed Memory (SDM) and 
Bayesian Learning. My talk gave an overview of the research in the SDM project. 

Now in its third year, the Sparse Distributed Memory (SDM) project is investigating the theory and 
applications of a massively parallel computing architecture, called sparse distributed memory, that will 
support the storage and retrieval of sensory and motor patterns characteristic of autonomous systems. 
The immediate objectives of the project are centered in studies of the memory itself and in the use of the 
memory to solve problems in speech, vision, and robotics. Investigation of methods for encoding sen- 
sory data is an important part of the research. Examples of NASA missions that may benefit from this 
work are Space Station, planetary rovers, and solar exploration. Sparse distributed memory offers 
promising technology for systems that must learn through experience and be capable of adapting to new 
circumstances, and for operating any large complex system requiring automatic monitoring and control. 
This work, which is conducted primarily within RIACS, includes collaborations with NASA codes FL 
and RI, Apple Computer Corporation, Hewlett-Packard Corporation, MCC, Stanford University, and 
other research groups to be determined. 

Sparse distributed memory is a massively parallel architecture motivated by efforts to understand 
how the human brain works, given that the brain comprises billions of sparsely interconnected neurons, 
and by the desire to build machines capable of similar behavior. Sparse distributed memory is an asso- 
ciative memory, able to retrieve information from cues that only partially match patterns stored in the 
memory. It is able to store long temporal sequences derived from the behavior of a complex system, 
such as progressive records of the system's sensory data and correlated records of the system's motor 
controls. Using its records of successful behavior in the past, sparse distributed memory can be used to 
recognize a similar circumstance in the present and to “predict” appropriate responses. Unlike numerical 
and symbolic computers, sparse distributed memory is a pattern computer, designed to process very 
large patterns formulated as bit strings that may be thousands of bits long. Each such bit string can serve 
as both content and address within the memory. Our project is concerned with research into aspects of 
sparse distributed memory that will enable us to evaluate and someday build autonomous systems based 
upon sparse distributed memory. 

For the coming three years we have proposed research in four general areas: theory and design of 
SDM architectures, representation of sensory and motor data as bit patterns suitable for SDM, organi- 
zation of SDM-based autonomous systems, and exploration of important domains of application. A 
major objective of our research is to explore the feasibility of SDM-based systems in applications such 
as vision processing, language processing, robotics and motor systems, and information retrieval. Each 
of the named areas will involve development of theory, simulations on appropriate computers such as 
the CM-2, and implementations on a digital prototype of SDM. 
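
The associative behavior described in this overview can be sketched in a few lines. The following is a toy software model under assumed parameters (256-bit words, 2000 randomly addressed storage locations, an activation radius of 112 bits); the class and method names are hypothetical and it is not the RIACS simulator or the digital prototype.

```python
import random

class SparseDistributedMemory:
    def __init__(self, n_bits=256, n_locations=2000, radius=112, seed=0):
        rng = random.Random(seed)
        self.n_bits = n_bits
        self.radius = radius
        # Storage locations: fixed random addresses held as Python integers.
        self.addresses = [rng.getrandbits(n_bits) for _ in range(n_locations)]
        # One up/down counter per bit at every location.
        self.counters = [[0] * n_bits for _ in range(n_locations)]

    def _active(self, address):
        """Indices of locations within the Hamming radius of the address."""
        return [i for i, hard in enumerate(self.addresses)
                if bin(hard ^ address).count("1") <= self.radius]

    def write(self, address, word):
        """Add the word to the counters of every active location."""
        for i in self._active(address):
            row = self.counters[i]
            for bit in range(self.n_bits):
                row[bit] += 1 if (word >> bit) & 1 else -1

    def read(self, address):
        """Sum the counters of the active locations and threshold at zero."""
        sums = [0] * self.n_bits
        for i in self._active(address):
            row = self.counters[i]
            for bit in range(self.n_bits):
                sums[bit] += row[bit]
        word = 0
        for bit, total in enumerate(sums):
            if total > 0:
                word |= 1 << bit
        return word
```

Writing a pattern with itself as the address and then reading with a corrupted copy of that pattern illustrates retrieval from a cue that only partially matches what was stored. Using each retrieved word as the next address is one way the same structure could replay a stored sequence.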
Algorithms and Architectures for Robot Vision 


Paul S. Schenker 
Jet Propulsion Laboratory 

The author has previously conducted research in vision devices, algorithms, and architectures. Most 
of this work has addressed problems in scene perception and object recognition in support of autono- 
mous robotics. A number of novel algorithms have resulted, including pyramid image analysis using 
contrast-normalized feature extraction [1], scale-rotation-aspect invariant image analysis using polar- 
exponential-grid representation [2,3], and high-speed image segmentation using multi-resolution 
stochastic search techniques [4]. Other efforts have included development of a multi-sensor fusion 
approach to scene analysis, and the development of a real-time VLSI machine vision architecture [5,6]. 

The scope of our current work is to develop practical sensing implementations for robots operating 
in complex, partially unstructured environments [7,8]. A focus in this work is to develop object models 
and estimation techniques which are specific to requirements of robot locomotion, approach and avoid- 
ance, and grasp and manipulation. Such problems have to date received limited attention in either com- 
puter or human vision — in essence, asking not only how perception is in general modeled, but also what 
is the functional purpose of its underlying representations [9]. As in the past [1,2], we are drawing on 
ideas from both the psychological and machine vision literature. Of particular interest to us is developing 
3-D shape and motion estimates for complex objects when given only partial and uncertain information 
and when such information is incrementally accrued over time. Our current studies consider the use of 
surface motion, contour, and texture information, with the longer range goal of developing a fused 
sensing strategy based on these sources and others. 
Computer Vision Techniques for Rotorcraft Low Altitude Flight 


Banavar Sridhar 
NASA Ames Research Center 

Rotorcraft operating in high-threat environments fly close to the Earth’s surface to utilize surround- 
ing terrain, vegetation, or manmade objects to minimize the risk of being detected by an enemy. Increas- 
ing levels of concealment are achieved by adopting different tactics during low-altitude flight. Rotorcraft 
employ three tactics during low-altitude flight: low-level, contour, and nap-of-the-Earth (NOE). The key 
feature distinguishing the NOE mode from the other two modes is that the whole rotorcraft, including 
the main rotor, is below tree-top whenever possible. This leads to the use of lateral maneuvers for avoid- 
ing obstacles, which in fact constitutes the means for concealment. The piloting of the rotorcraft is at 
best a very demanding task and the pilot will need help from onboard automation tools in order to 
devote more time to mission-related activities. The development of an automation tool which has the 
potential to detect obstacles in the rotorcraft flight path, warn the crew, and interact with the guidance 
system to avoid detected obstacles, presents challenging problems. 

This presentation describes research which applies techniques from computer vision to automation of 
rotorcraft navigation. The effort emphasizes the development of a methodology for detecting the ranges 
to obstacles in the region of interest based on the maximum utilization of passive sensors. The range 
map derived from the obstacle-detection approach can be used as obstacle data for the obstacle avoid- 
ance in an automatic guidance system and as advisory display to the pilot. The lack of suitable flight 
imagery data presents a problem in the verification of concepts for obstacle detection. This problem is 
being addressed by the development of an adequate flight database and by preprocessing of currently 
available flight imagery. The presentation concludes with some comments on future work and how 
research in this area relates to the guidance of other autonomous vehicles. For further details on the work 
reported here please refer to the following list of papers. 
Kalman Filter Based Range Estimation for Autonomous Navigation Using Imaging Sensors 


Banavar Sridhar 
NASA Ames Research Center 

Rotorcraft operating in high-threat environments fly close to the Earth’s surface to utilize 
surrounding terrain, vegetation, or man-made objects to minimize the risk of being detected by the 
enemy. The piloting of the rotorcraft is at best a very demanding task and the pilot will need help from 
on-board automation tools in order to devote more time to mission-related activities. The development 
of an automation tool, which has the potential to detect obstacles in the rotorcraft flight path, warn the 
crew, and interact with the guidance system to avoid detected obstacles, presents challenging problems 
in control, computer vision and image understanding. 

The planning of rotorcraft low-altitude missions can be divided into far-field planning and near-field 
planning (Cheng and Sridhar, 1988). Far-field planning involves the selection of goals and a nominal 
trajectory between the goals. Far-field planning is based on a priori information and requires a detailed 
map of the local terrain. However, the database for even the best surveyed landscape will not have 
adequate resolution to indicate objects such as trees, buildings, wires and transmission towers. This 
information has to be acquired using an onboard sensor and integrated into the navigation/guidance 
system to modify the nominal trajectory of the rotorcraft. Initially, passive imaging sensors such as 
forward-looking infrared (FLIR) and low-light-level television (LLLTV) will be considered for 
detection to assess the limitation of passive methods. The two basic requirements for obstacle avoidance 
are detection and range estimation of the objects from the current rotorcraft position. 

There are many approaches to the estimation of range using a sequence of images. The approach 
used in this analysis differs from previous methods in two significant ways: (i) we do not attempt to 
estimate the rotorcraft’s motion from the images, and (ii) our interest lies in recursive algorithms. The 
rotorcraft parameters (position, translational velocity, rotational velocity and attitude) are assumed to be 
computed using an onboard inertial navigation system. Given a sequence of images, using image-object 
differential equations, a Kalman filter (Sridhar and Phatak, 1988) can be used to estimate both the 
relative coordinates and the Earth coordinates of objects on the ground. The Kalman filter can also be 
used in a predictive mode to track features in the images, leading to a significant reduction of search 
effort in the feature extraction step of the algorithm. The performance of three different Kalman filters 
for different rotorcraft maneuvers was examined by Sridhar and Phatak (1988). This previous study did 
not, however, include the processing of real images. The purpose of this paper is to summarize early 
results obtained in extending the Kalman filter for use with actual image sequences. These tests were 
restricted to linear motion in order to reduce the image processing requirements. The experience gained 
from the application of this algorithm to real images is very valuable and is a necessary step before 
proceeding to the estimation of range during low-altitude curvilinear flight. 
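
The recursive character of such an estimator can be illustrated with a minimal sketch. It is not the filter of Sridhar and Phatak (1988); it assumes a pinhole camera translating sideways along x with positions known from the inertial navigation system, a static point at (X0, Z), and the state s = [X0/Z, 1/Z], which makes the measurement linear. The function name and all parameters are hypothetical.

```python
def estimate_range(pixel_obs, cam_x, focal_len, meas_var=1.0):
    """Recursive (Kalman) estimate of the range Z to a static point.

    Assumed model (illustrative only): the camera is at known lateral
    positions cam_x[k]; the point at (X0, Z) projects to
        u_k = focal_len * (X0 - cam_x[k]) / Z.
    With state s = [X0/Z, 1/Z] this is linear:
        u_k = focal_len * (s[0] - cam_x[k] * s[1]).
    """
    s = [0.0, 0.0]                    # state estimate
    P = [[1e4, 0.0], [0.0, 1e4]]      # large initial covariance: weak prior
    for u, x in zip(pixel_obs, cam_x):
        h = [focal_len, -focal_len * x]              # measurement row H_k
        Ph = [P[0][0] * h[0] + P[0][1] * h[1],
              P[1][0] * h[0] + P[1][1] * h[1]]       # P H^T
        innov_var = h[0] * Ph[0] + h[1] * Ph[1] + meas_var
        K = [Ph[0] / innov_var, Ph[1] / innov_var]   # Kalman gain
        innov = u - (h[0] * s[0] + h[1] * s[1])      # measurement residual
        s = [s[0] + K[0] * innov, s[1] + K[1] * innov]
        # P <- P - K (P H^T)^T, valid because P is symmetric
        P = [[P[i][j] - K[i] * Ph[j] for j in range(2)] for i in range(2)]
    return 1.0 / s[1] if s[1] != 0 else float("inf")
```

Each new image updates the estimate in place, so no image history needs to be stored. The same filter run in predictive mode would supply the expected feature location for the next frame.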

We have presented a simple recursive method to estimate range to objects using a sequence of 
images. The method produces good range estimates using real images in a laboratory setup and needs to 
be evaluated further using several different image sequences to test its robustness. The feature 
generation part of the algorithm requires further refinement on the strategies to limit the number of 
features (Sridhar and Phatak, 1989). The extension of the work reported here to curvilinear flight may 
require the use of the extended Kalman filter. 


The research reported in this paper is part of an ongoing effort at NASA Ames to develop 
technologies for the automation of rotorcraft low-altitude flight. The object detection and range 
estimation algorithms discussed are quite general and have potential applications in robotics and 
autonomous navigation of vehicles. In addition to these feature-based algorithms, there are parallel 
efforts to investigate field-based techniques for the same range estimation applications (Menon and 
Sridhar, 1989; Kendall and Jacobi, 1989). 
Instrumentation and Robotic Image Processing 
Using Top-Down Model Control 


Lawrence Stark, Barbara Mills, An H. Nguyen, Huy X. Ngo 
Telerobotics Unit 

University of California at Berkeley 

A top-down image processing scheme is described. A three-dimensional model of a robotic working 
environment, with robot manipulators, workpieces, cameras, and on-the-scene visual enhancements is 
employed to control and direct the image processing, so that rapid, robust algorithms act in an efficient 
manner to continually update the model. Only the model parameters are communicated, so that savings 
in bandwidth are achieved. This image compression by modeling is especially important for control of 
space telerobotics. 

The background for this scheme lies in an hypothesis of human vision put forward by the senior 
author and colleagues almost 20 years ago — the Scanpath Theory. Evidence was obtained that repetitive 
sequences of saccadic eye movements, the scanpath, acted as the checking phase of visual pattern recog- 
nition. Further evidence was obtained that the scanpaths were apparently generated by a cognitive model 
and not directly by the visual image. This top-down theory of human vision was generalized in some 
sense to the ‘frame’ in artificial intelligence. 

Another source of our concept arose from bioengineering instrumentation for measuring the pupil 
and eye movements with infrared video cameras and special-purpose hardware. Since the image avail- 
able to the instrument camera was well-defined, a model of the view could be used to direct the image 
processing algorithms to particular regions of interest and to particular parameters such as the diameter 
of the pupil or the centroid of the corneal reflection. Thus, robust, rapid image processing could be 
obtained under control by the known top-down picture. 
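
As a minimal sketch of this kind of model-directed measurement, the function below computes the intensity-weighted centroid of bright pixels inside a region of interest that a model is assumed to supply. The function name, the image representation (a list of rows), and the threshold are illustrative assumptions, not the authors' implementation.

```python
def roi_centroid(image, roi, threshold):
    """Intensity-weighted centroid of pixels >= threshold inside roi.

    image: list of rows of pixel intensities.
    roi:   (row0, row1, col0, col1), half-open, assumed to come from
           the top-down model's prediction of where the feature lies.
    Returns (row, col) or None if no pixel passes the threshold.
    """
    r0, r1, c0, c1 = roi
    total = sum_r = sum_c = 0.0
    for r in range(r0, r1):
        for c in range(c0, c1):
            v = image[r][c]
            if v >= threshold:
                total += v
                sum_r += v * r
                sum_c += v * c
    if total == 0:
        return None
    return (sum_r / total, sum_c / total)
```

Because only the pixels inside the predicted region are visited, bright distractors elsewhere in the frame are ignored and the cost scales with the size of the region rather than the size of the image.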
Computer Vision Research at Marshall Space Flight Center 


Frank L. Vinz 

Marshall Space Flight Center 

Orbital docking, inspection, and servicing are operations which have the potential for capability 
enhancement as well as cost reduction for space operations by the application of computer vision 
technology. Research at MSFC has been a natural outgrowth of orbital docking simulations for remote 
manually controlled vehicles such as the Teleoperator Retrieval System and the Orbital Maneuvering 
Vehicle (OMV). Baseline design of the OMV dictates teleoperator control from a ground station. This 
necessitates a high data-rate communication network and results in several seconds of time delay. 
Operational costs and vehicle control difficulties could be alleviated by an autonomous or semi- 
autonomous control system onboard the OMV which would be based on a computer vision system 
having capability to recognize video images in real time. A concept under development at MSFC with 
these attributes is based on syntactic pattern recognition. It uses tree graphs for rapid recognition of 
binary images of known orbiting target vehicles. This technique and others being investigated at MSFC 
will be evaluated in realistic conditions by the use of MSFC orbital docking simulators. 

Computer vision is also being applied at MSFC as part of the supporting development for Work 
Package One of Space Station Freedom. The objective of this is to automate routine tasks such as 
locating, fetching, storing, adjusting, or monitoring experiments, thereby relieving crewmen for more 
demanding tasks. This vision system would be used in conjunction with a robot arm planned for use in 
the laboratory module. This vision system would also relieve accuracy requirements for instrumentation 
of arm positioning. One approach for this has been contracted to researchers at the University of 
Alabama in Huntsville who are developing a real-time expert vision system. This expert system uses 
knowledge to achieve a high performance level at every stage of an image-to-decision paradigm. 
Stanford/NASA-Ames Center of Excellence 
in Model-Based Human Performance 


Brian A. Wandell 
Stanford University 

The human operator plays a critical role in many aeronautic and astronautic missions. The Stanford/ 
NASA-Ames Center of Excellence in Model-Based Human Performance (COE) was initiated in 1985 to 
further our understanding of the performance capabilities and performance limits of the human compo- 
nent of aeronautic and astronautic projects. Support from the COE is devoted to those areas of experi- 
mental and theoretical work designed to summarize and explain human performance by developing 
computable performance models. Our ultimate goal is to make these computable models available to 
other scientists for use in design and evaluation of aeronautic and astronautic instrumentation. 

The COE currently provides a portion of the research support of four principal investigators (Pavel, 
Rumelhart, Shepard, and Wandell). During the last three years more than ten graduate students and post- 
doctoral students have participated in the research supported by the COE. The research interests of the 
participating faculty members and students range across the areas of vision science, cognitive science, 
and neural networks. 

Within vision science, two topics have received particular attention. First, we have done extensive 
work analyzing the human ability to recognize object color relatively independent of the spectral power 
distribution of the ambient lighting (color constancy). The COE has supported a number of research 
papers in this area, as well as the development of a substantial data base of surface reflectance functions, 
ambient illumination functions, and an associated software package for rendering and analyzing image 
data with respect to these spectral functions. The software and data base of reflectances have been 
distributed to laboratories around the world. 
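
The role of these spectral functions can be sketched with the standard linear model of image formation: the light reaching a sensor is the wavelength-by-wavelength product of illuminant and surface reflectance, and each sensor response is a weighted sum of that product. The code below is an illustration only, with hypothetical names and spectra sampled at a few wavelengths; it is not the distributed software package.

```python
def sensor_responses(reflectance, illuminant, sensitivities):
    """Responses of each sensor class to a surface under an illuminant.

    reflectance:   surface reflectance sampled at N wavelengths.
    illuminant:    spectral power distribution at the same N wavelengths.
    sensitivities: one list of N spectral sensitivities per sensor class.
    """
    # Color signal: light reflected from the surface toward the sensor.
    color_signal = [r * e for r, e in zip(reflectance, illuminant)]
    return [sum(s * c for s, c in zip(sens, color_signal))
            for sens in sensitivities]
```

Evaluating the same reflectance under two different illuminants gives different responses, which is why recovering surface color independent of the lighting requires additional constraints on the spectral functions.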

Second, the COE has supported new empirical studies on the problem of selecting colors for visual 
display equipment, such as CRTs, to enhance human performance in discrimination and recognition 
tasks. Classic color metric work, which is often used to define color specifications on visual display 
equipment, was performed using tasks that are inappropriate for the viewing conditions experienced by 
pilots. At the suggestion of our colleagues in the Vision Group at NASA-Ames, we have conducted new 
experiments that extend the range of measurement conditions to bring them closer into alignment with 
the viewing conditions encountered in flight. 
Andrew B. Watson 

Principal Scientist, Human Interface Research Branch 
NASA Ames Research Center 

Computational Models of Early Vision- A major goal of our research group is to develop 
mathematical and computational models of early human vision. These models are valuable in the 
prediction of human performance, in the design of visual coding schemes and displays, and in robotic 
vision. To date we have models of retinal sampling, spatial processing in visual cortex, contrast 
sensitivity, and motion processing. 

Image Coding- Based on our models of early human vision, we have developed several schemes for 
efficient coding and compression of monochrome and color images. These are pyramid schemes that 
decompose the image into features that vary in location, size, orientation, and phase. To determine the 
perceptual fidelity of these codes, we have developed novel human testing methods that have received 
considerable attention in the research community. 

Motion Processing- Visual motion processing is an important capability in both man and machine. 
In both cases, the challenge is to convert a time-sequence of images into descriptions of image motion, 
and ultimately into descriptions of object motions. We have constructed models of human visual motion 
processing based on physiological and psychophysical data, and have tested these models through simulation 
and human experiments. We have also explored the use of these biological algorithms in 
automated guidance of rotorcraft and autonomous landing of spacecraft. 

Neural Networks- The human visual system comprises layers of neural networks which sample, 
process, code, and recognize images. Understanding these networks is a valuable means of under- 
standing human vision and of designing autonomous vision systems. We have developed networks for 
inhomogeneous image sampling, for pyramid coding of images, for automatic geometrical correction of 
disordered samples, and for removal of motion artifacts from unstable cameras. We are collaborating 
with the Research Institute for Advanced Computer Science (RIACS) on networks for automatic visual 
pattern recognition. 

Human Psychophysics- To determine fundamental aspects of human visual performance and to 
validate our computational models we maintain a vigorous program of psychophysical experiments on 
human observers. Currently this work emphasizes perception of coding artifacts, motion perception, and 
spatial scale of visual functions. In collaboration with Stanford, we are testing fundamental color vision 
capacities. 
Pyramid Image Codes 


Andrew B. Watson 

Aerospace Human Factors Research Division 
NASA Ames Research Center 

All vision systems, both human and machine, transform the spatial image into a coded representa- 
tion. Particular codes may be optimized for efficiency or to extract useful image features. We have 
explored image codes based on primary visual cortex in man and other primates. Understanding these 
codes will advance the art in image coding, autonomous vision, and computational human factors. 

In cortex, imagery is coded by features that vary in size, orientation, and position. We have devised 
a mathematical model of this transformation, called the Hexagonal oriented Orthogonal quadrature Pyra- 
mid (HOP). In a pyramid code features are segregated by size into layers, with fewer features in the 
layers devoted to large features. Pyramid schemes provide scale invariance, and are useful for coarse-to- 
fine searching and for progressive transmission of images. 

The HOP Pyramid is novel in three respects: 1) it uses a hexagonal pixel lattice, 2) it uses oriented 
features, and 3) it accurately models most of the prominent aspects of primary visual cortex. The trans- 
form uses seven basic features (kernels), which may be regarded as three oriented edges, three oriented 
bars, and one non-oriented “blob.” Application of these kernels to non-overlapping seven-pixel neigh- 
borhoods yields six oriented, high-pass pyramid layers, and one low-pass (blob) layer. Subsequent high- 
pass layers are produced by recursive application of the seven kernels to each low-pass layer. 
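
The bookkeeping implied by this recursion can be sketched as follows. The code only counts coefficients per layer. It assumes the pixel count is an exact power of seven and does not implement the HOP kernels themselves. The function name is hypothetical.

```python
def hop_layer_sizes(n_pixels, levels):
    """Coefficient counts per pyramid level for a seven-kernel transform.

    Each level maps non-overlapping 7-pixel neighborhoods to 7
    coefficients: 6 oriented high-pass and 1 low-pass. The low-pass
    layer becomes the input to the next level.
    Returns a list of (level, high_pass_count, low_pass_count).
    """
    layers = []
    n = n_pixels
    for level in range(1, levels + 1):
        if n % 7 != 0:
            raise ValueError("pixel count must be divisible by 7 at each level")
        n //= 7                       # one neighborhood per 7 input samples
        layers.append((level, 6 * n, n))
    return layers
```

For a 343-pixel image and three levels, the high-pass counts plus the final low-pass sample add up to 343, so the code has as many coefficients as the image has pixels, with fewer features in the layers devoted to larger scales.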

Preliminary results on use of the HOP transform for image compression show that 24-bit color 
images can be coded at about 1 bit/pixel with reasonable fidelity. Future work will explore related codes 
and more detailed comparisons to biological coding, and applications to motion processing and shape 
perception. 
Intensity Dependent Spread Processor and Workstation 


George Westrom 
Odetics, Inc. 

The Intensity Dependent Spread (IDS) is an adaptive algorithm which is modified according to the 
local intensity in the scene. (This results in a nonlinear process which cannot take advantage of rather 
nice linear transform methods.) The computation is similar to a neural net whereby intensity information 
is moving from each input pixel to a set of surrounding output pixels in a manner described by 
Cornsweet and Yellott. 
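
A one-dimensional sketch may help convey the idea. It assumes the commonly described form of the model, in which each input pixel spreads a fixed volume over an area inversely proportional to its intensity. The box-shaped spread, the constant k, and the function name are simplifying assumptions and do not describe the Odetics processor.

```python
def ids_1d(intensities, k=8.0):
    """Simplified 1-D intensity-dependent spread.

    Each input pixel distributes unit volume uniformly over a window
    whose half-width is round(k / intensity): bright pixels spread
    narrowly, dim pixels spread widely. Contributions falling outside
    the array are discarded.
    """
    n = len(intensities)
    out = [0.0] * n
    for i, inten in enumerate(intensities):
        if inten <= 0:
            continue  # zero intensity would spread infinitely wide; skipped here
        half = int(round(k / inten))
        width = 2 * half + 1
        for j in range(i - half, i + half + 1):
            if 0 <= j < n:
                out[j] += 1.0 / width
    return out
```

Because the window size depends on the data, the operation is not a convolution with a fixed kernel, which is why linear transform methods do not apply. Away from the borders, a uniform input of any positive intensity produces the same output level in this sketch.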

A prototype of a VLSI IDS processor is being developed and implemented in a workstation environment. 
The workstation consists of a SUN 3/260 and a DATACUBE pipeline processor. The IDS 
prototype is a board set which operates in the DATACUBE processor. The SUN 3/260 performs 
control, background processing, IDS simulation and image display functions. 
Space Environment Robot Vision System 


H. John Wood 
William L. Eichhorn 

NASA Goddard Space Flight Center 


A prototype twin-camera stereo vision system for autonomous robots has been developed at Goddard 
Space Flight Center. Standard CCD imagers are interfaced with commercial frame buffers and direct 
memory access to a computer. The overlapping portions of the images are analyzed using photogram- 
metric techniques to obtain information about the position and orientation of objects in the scene. 

The camera head consists of two 510 x 492 x 8-bit CCD cameras mounted on individually adjustable 
mounts. The 16-mm efl lenses are designed for minimum geometric distortion. The cameras can be 
rotated in the pitch, roll, and yaw (pan angle) directions with respect to their optical axes. 

Calibration routines have been developed which automatically determine the lens focal lengths and 
pan angle between the two cameras. The calibration utilizes observations of a calibration structure with 
known geometry. Test results show the precision attainable is ±0.8 mm in range at 2 m distance using a 
camera separation of 171 mm. 
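
To put these figures in context, the sketch below uses the idealized parallel-axis pinhole stereo relation Z = fB/d. This is a simplifying assumption, since the actual cameras are set at a pan angle, and the function names are hypothetical.

```python
def range_from_disparity(focal_mm, baseline_mm, disparity_mm):
    """Range Z = f * B / d for idealized parallel-axis stereo (all in mm)."""
    return focal_mm * baseline_mm / disparity_mm


def disparity_precision(range_mm, range_err_mm, focal_mm, baseline_mm):
    """Disparity error corresponding to a given range error.

    Differentiating Z = f * B / d gives |dZ| = Z**2 / (f * B) * |dd|,
    so |dd| = |dZ| * f * B / Z**2.
    """
    return range_err_mm * focal_mm * baseline_mm / range_mm ** 2
```

Under this model, a range precision of 0.8 mm at 2 m with a 16-mm lens and 171-mm baseline corresponds to a disparity precision of roughly 0.00055 mm on the image plane, which suggests that feature locations are being measured to well below typical CCD pixel dimensions.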

To demonstrate a task needed on Space Station Freedom, a target structure with a movable “I” beam 
was built. The camera head can autonomously direct actuators to “dock” the I-beam to another one so 
that they could be bolted together. 
Self-Calibration of Robot-Sensor System 


Pen-Shu Yeh 

Goddard Space Flight Center 

The process of finding the coordinate transformation between a robot and an external sensor system 
has been addressed. This calibration is equivalent to solving a nonlinear optimization problem for the 
parameters that characterize the transformation. A two-step procedure is herein proposed for solving the 
problem. The first step involves finding a nominal solution that is a good approximation of the final 
solution. A variational problem is then generated to replace the original problem in the next step. With 
the assumption that the variational parameters are small compared to unity, the problem can be more 
readily solved with relatively small computational effort. 
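
The two-step idea can be illustrated on a deliberately reduced problem: a planar rotation about a known origin, with no translation. Given a nominal angle, the small correction is found by linearizing the rotation about the nominal solution and solving a linear least-squares problem. The setup and names are assumptions made for illustration and are not the formulation used in the paper.

```python
import math


def rotation_correction(points, observed, theta0):
    """Small-angle correction to a nominal planar rotation theta0.

    points:   (x, y) coordinates in the robot frame.
    observed: the same points as measured in the sensor frame.
    Linearization: R(theta0 + d) p ~= r + d * J r, where r = R(theta0) p
    and J r = (-r_y, r_x). The least-squares d is returned.
    """
    c, s = math.cos(theta0), math.sin(theta0)
    num = den = 0.0
    for (px, py), (qx, qy) in zip(points, observed):
        rx, ry = c * px - s * py, s * px + c * py   # nominal prediction
        ex, ey = qx - rx, qy - ry                   # residual to explain
        num += -ry * ex + rx * ey                   # (J r) . e
        den += rx * rx + ry * ry                    # |J r|^2
    return num / den
```

The correction is added to the nominal angle, and the step can be repeated if the residual remains significant. Each pass requires only sums over the data rather than a nonlinear search.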
APPENDIX 


Directory 

The following is a preliminary directory of individuals involved in NASA VST. It includes 
participants at both Ames and Langley workshops, as well as other NASA personnel, contractors, and 
students. 